How to forward a port through a jump box

When you have set up Apache Spark and use Jupyter to run analyses on it, you’ll need to connect to the Jupyter notebooks by forwarding the port the notebooks run on to your local machine.
Depending on how the server that runs Spark is secured, you might need to do that through a “jump box”, a server that is hardened to prevent unauthorized access and lets you access a network that’s otherwise not directly accessible from the Internet.

If you’re as untrained in using ssh as I am, it can be a bit frustrating to set that up yourself because it’s not entirely obvious when googling around. In the tradition of writing things up so that I don’t have to google them over and over again, here’s how to do it.

The first thing to know: there’s a file called ~/.ssh/config where you can “store” ssh connections instead of typing them in manually all the time. That’s what makes it possible to type ssh my-server and access your server instead of ssh my-username@my-host-address -i /path/to/ssh-key-file. Blew my mind when I learned this.
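To make that concrete, here’s a minimal sketch of what such a stored connection could look like. It writes the entry to a scratch file rather than touching ~/.ssh/config directly, so you can review it first. The names my-server, my-host-address, my-username and the key path are just the placeholders from the paragraph above, and the variable scratch_config is made up for this example.

```shell
# Sketch: write a minimal Host entry to a scratch file instead of ~/.ssh/config.
# All values are placeholders; replace them with your own.
scratch_config=$(mktemp)

cat > "$scratch_config" <<'EOF'
Host my-server
  HostName my-host-address
  User my-username
  IdentityFile /path/to/ssh-key-file
EOF

# Review the entry, then copy it into ~/.ssh/config by hand.
cat "$scratch_config"

# With the entry in ~/.ssh/config, the short form
#   ssh my-server
# stands in for
#   ssh my-username@my-host-address -i /path/to/ssh-key-file
```

Host is the alias you type, HostName is the real address ssh connects to, and User and IdentityFile replace the username and the -i flag.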

So, open ~/.ssh/config in your editor of choice, then add the following:

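Host jump-box
  # Assumption: replace address-of-jump-box with the jump box's hostname or IP
  HostName address-of-jump-box
  User your-user-name-on-jump-box
  IdentityFile /Users/local-user-name/.ssh/ssh_key_file_for_jump-box
  ForwardAgent yes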

Host jupyter-box
  User your-username-on-jupyter-server
  ForwardAgent yes
  # Connect through jump-box; nc (netcat) must be installed on the jump box
  ProxyCommand ssh -q jump-box nc address-of-jupyter-box 22
  IdentityFile /Users/local-user-name/.ssh/ssh_key_for_jupyter-box
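A possible alternative, and not the setup shown above: on OpenSSH 7.3 or newer, the ProxyJump keyword can take the place of the ProxyCommand line, and it doesn’t rely on nc being available on the jump box. Here’s a rough sketch, again written to a scratch file so nothing in ~/.ssh/config changes until you copy it over. The HostName line is an assumption on my part: with ProxyJump, ssh needs the real address of the Jupyter server in the entry itself. The variable proxyjump_config is made up for this example.

```shell
# Sketch of a ProxyJump variant of the jupyter-box entry (needs OpenSSH 7.3+).
# It assumes a "Host jump-box" entry already exists in your ssh config.
proxyjump_config=$(mktemp)

cat > "$proxyjump_config" <<'EOF'
Host jupyter-box
  HostName address-of-jupyter-box
  User your-username-on-jupyter-server
  IdentityFile /Users/local-user-name/.ssh/ssh_key_for_jupyter-box
  ProxyJump jump-box
EOF

# Review the entry before replacing the ProxyCommand version with it.
cat "$proxyjump_config"
```

For a one-off connection without a config entry for the Jupyter server, ssh -J jump-box your-username-on-jupyter-server@address-of-jupyter-box should do the same hop from the command line.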

When you’ve added this to your ~/.ssh/config file, all you need to do to connect to the protected server and forward the port to access your Jupyter notebooks is:

$ ssh -L 8889:localhost:8889 jupyter-box

In this case, we’re forwarding port 8889, the port my Jupyter notebooks run on. Once the connection is up, the notebooks are reachable at http://localhost:8889 in your local browser.
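If your notebooks run on a different port, the -L argument is the only part that changes. Its form is local_port:destination_host:destination_port, and the destination is resolved on the far end of the connection, so localhost here means the Jupyter server itself. The small helper below is hypothetical (forward_cmd is a name I made up) and only prints the command instead of running it, so you can check it before connecting. It also adds -N, which tells ssh not to open a remote shell when all you want is the forward.

```shell
# Hypothetical helper: print (not run) the ssh command for a port forward.
# Usage: forward_cmd LOCAL_PORT REMOTE_PORT HOST_ALIAS
forward_cmd() {
  local_port=$1
  remote_port=$2
  host_alias=$3
  # -N: no remote command, just keep the forward open
  printf 'ssh -N -L %s:localhost:%s %s\n' "$local_port" "$remote_port" "$host_alias"
}

# Same forward as in the text above.
forward_cmd 8889 8889 jupyter-box

# Jupyter on remote port 8888, reached locally at http://localhost:9000
forward_cmd 9000 8888 jupyter-box
```

Copy the printed line into your terminal to open the forward, and press Ctrl-C to close it again.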

Tada, done!

Special thanks to this Unix Stackexchange discussion and users Naftuli Kay and Celada for writing up their solutions 💪

Lukas Kawerau
Data Engineer, Writer