Channel: web – Gadget Magazine

Command line web browsing

From almost every app being on the command line to doing everything through the web browser, GNU/Linux has come a long way towards user-friendliness. But in always using that ever-present Firefox or Chromium session, something has been lost along the way.

Every new tab opened on the browser is time wasted in mouse operations and in seconds ticking away for the World Wide Wait for some AJAX-heavy page to load. Yet, just as many GUI apps have arguably better equivalents on the command line, so too do many daily operations you carry out on the web have quicker terminal equivalents that can save you time.

We’re not just talking about saving a couple of seconds; going from an SSH session, checking logs on your server, to opening a web browser for a search on something involves moving concentration away from your project, as the sight of all of your open tabs beckons you to a multitude of distractions.

Remember, this isn’t about replacing GUI apps with terminal ones – we’re not covering browsers and IRC clients here; it’s about getting things done on the web with a quick command in your terminal. We’ll cover downloading and sharing, but let’s start with where commands should be a natural fit: searching the web.

Surfraw used the command line to search over a hundred engines and resources, leaving the browser for browsing

Resources

Surfraw
cURL
wget
aria2
youtube-dl
get_iplayer
get_flash_videos

Step-by-step

Step 01 Before WikiLeaks

Surfraw stands for the Shell User’s Revolutionary Front Rage Against the Web, and was written by Julian Assange many years before he became better known for another project. Surfraw is installable through your package manager and it will bring web searches to the command line.

Step 02 Surfraw

Putting search on the command line is a good fit, as you simply put:

sr google raspberry pi

…and you’ll be looking at Google search results for Raspberry Pi in a sensible default browser (w3m on most Ubuntu systems). Other command line, or GUI, browsers can be set in the config file (note: all file locations given may vary depending on the distro).

Step 03 Elvi search scripts

You can see more than a hundred available search options with:

sr -elvi

Elvi are the search scripts for various engines or sites. You’ll find them in /usr/lib/surfraw/ and they, as well as surfraw options and arguments, are tab-completable.

Step 04 Changing web

While some defaults are growing out of date – searches for the late, lamented NTK and Freshmeat are just two examples – Surfraw still ships with many useful search directories and is still being updated, with GitHub and jQuery docs among those added in the last release. Creating your own is left as an exercise for the reader.

Step 05 Def and defyn

The commented config file is /etc/xdg/surfraw/conf – def and defyn are used here to define variables. The latter defines Boolean values such as:

defyn SURFRAW_graphical no

You can create per-user settings in ~/.config/surfraw/conf with sh-style entries:

SURFRAW_graphical=no
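
As a slightly fuller sketch, a per-user file might also pick the text browser mentioned in Step 02. The variable names below are the ones we’d expect to find in the commented system config; treat them as assumptions and check them against your own /etc/xdg/surfraw/conf before relying on them.

```shell
# ~/.config/surfraw/conf (per-user, plain sh assignments)
# Assumed variable names; confirm against your distro's commented system config.
SURFRAW_graphical=no        # prefer a text browser over a GUI one
SURFRAW_text_browser=w3m    # which text browser to launch
```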

Step 06 In your script

The other side of the command line is shell scripting, to chain together utilities in repeatable programs. For this, Surfraw has a -p option to pass the URL to STDOUT instead of the default browser and an -o option to specify a text file to dump the browser’s html.

sr -p rhyme -method=perfect orange
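
Because -p only prints the URL, it slots neatly into a loop. The following is a minimal sketch rather than anything shipped with Surfraw: search_urls is a hypothetical helper name, and it assumes sr is on your PATH.

```shell
# Print one search URL per term, all from the same elvi.
# search_urls is a hypothetical helper; it relies only on `sr -p`.
search_urls() {
    elvi=$1
    shift
    for term in "$@"; do
        sr -p "$elvi" "$term"
    done
}

# Usage (needs surfraw installed):
#   search_urls google "raspberry pi" beaglebone > urls.txt
```

The resulting file can then be handed to whichever downloader or browser you prefer.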

Step 07 Get Wget

You’ve probably used GNU Wget before to grab a particular file or binary resource from a remote server. Add the -O option to specify a destination:

$ wget -O ~/bin/dropbox.py "https://www.dropbox.com/download?dl=packages/dropbox.py"

Step 08 Fetch and clone

The two most useful options are -c, to resume an interrupted download (even one started by another program), and -r, which is a recursive fetch to a default depth of five directory levels, enabling you to fetch or clone whole websites.
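
Put together, the two options might look like the sketch below. mirror_section is a hypothetical wrapper name and the URL in the usage line is a placeholder; -l (depth) and -np (never climb to the parent directory) are extra Wget options added here to keep a recursive fetch from wandering.

```shell
# Recursively fetch one section of a site, resuming any partial files.
#   -c   continue interrupted downloads
#   -r   recursive fetch
#   -l   maximum recursion depth (Wget's own default is 5)
#   -np  do not ascend above the starting directory
mirror_section() {
    url=$1
    depth=${2:-2}    # illustrative default, shallower than Wget's
    wget -c -r -l "$depth" -np "$url"
}

# Usage (needs network access):
#   mirror_section http://example.com/docs/ 3
```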

Step 09 Tips and tricks

Wget may be more primitive than the two rivals covered below, but you’ll find many Wget tricks for working around blockages to downloads, so you can grab a particular resource from, say, your command-line-only server. The -e switch executes a command as if it were a line in your .wgetrc, which enables many useful tweaks:

wget -e robots=off

Step 10 cURL fetching

Handy as Wget is, cURL is a far more flexible fetching friend, and it sends too. It’s invaluable for quickly checking the state of your sites with:

curl -I gonetoearth.org

curl -I passes the headers of a site to the terminal.
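
In a script you usually want just the status code rather than the full header block. One way to get it, sketched here with a hypothetical helper called status_code, is to pull the second field from the first header line:

```shell
# Read response headers on stdin and print the status code
# found on the first line (e.g. "HTTP/1.1 200 OK" gives 200).
status_code() {
    awk 'NR == 1 { print $2; exit }'
}

# Usage (needs network access; -s hides the progress meter):
#   curl -sI gonetoearth.org | status_code
```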

Step 11 Two-way street

cURL writes by default to STDOUT, which is handier for scripting, but -O will save the resource and a lowercase -o lets you specify a name to save as. When you’re directing the output away from the terminal, cURL displays a progress meter there.

Credentials can be passed with -u to both http and ftp sites, and uploads to the latter made with the -T switch.

curl -u username:password -T "{file1,file2}" ftp://ftp.myserver.com/ -T "{patch1,module1}" ftp://ftp.mywebserver.com/

curl -X lets you specify PUT or POST methods instead of GET, for testing site features, even multipart forms.
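
As a rough sketch of a POST test, the hypothetical post_form helper below sends each key=value pair as a URL-encoded form field using cURL’s --data-urlencode option. The endpoint in the usage line is a placeholder, not a real test service.

```shell
# POST any number of key=value fields to a URL as a URL-encoded form.
post_form() {
    url=$1
    shift
    # Rewrite the argument list so every field is preceded by --data-urlencode.
    for field in "$@"; do
        set -- "$@" --data-urlencode "$field"
        shift
    done
    curl -s -X POST "$@" "$url"
}

# Usage (needs network access):
#   post_form http://example.com/form "name=Ada Lovelace" "lang=en"
```

For a multipart form you would swap --data-urlencode for cURL’s -F option, which takes the same key=value shape.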

Step 12 Change the MOTD

Looking for a change from your distro’s usual MOTD (the message that greets you upon login)? Let cURL grab you a headline, joke or anything else from the multitudinous resources of the web.

This command, for example, courtesy of bashoneliners.com, will give you a randomised string of corporate management jargon which may well be indistinguishable from recent communiqués from your bosses:

curl -s http://cbsg.sourceforge.net/cgi-bin/live | grep -Eo '^<li>.*</li>' | sed 's,</\?li>,,g' | shuf -n 1
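
To see what the filtering stages of a one-liner like this do without touching the network, the sketch below feeds a hand-written sample of list markup through grep and sed. The sample is an assumption about the page’s shape, not captured output, and extract_items is a hypothetical name. It uses sed -E so the optional slash can be written as a plain ?.

```shell
# Keep only lines that consist of a single <li> item, then strip the tags.
extract_items() {
    grep -Eo '^<li>.*</li>' | sed -E 's,</?li>,,g'
}

# Hand-written sample input; the real page may be shaped differently.
printf '%s\n' '<ul>' '<li>leverage synergies</li>' '<li>drive alignment</li>' '</ul>' | extract_items
```

Piping the result through shuf -n 1 then picks one line at random.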

    Step 13 Grab with aria2

    Wget is installed by default almost everywhere and cURL is attaining default status too. By contrast, aria2 is not so well-known, but it’s a good way of grabbing the latest ISO – or any file or software, as metalink tries to look for the best version by location, language and OS.

    Step 14 Share the (down)load

    Aria2 works with torrents, which remain the best way for downloading distros. Everything from upload throttling to share ratio can be specified on the command line.

    aria2c --seed-time=120 --seed-ratio=3.0 http://releases.ubuntu.com/14.04.1/ubuntu-14.04.1-server-amd64.iso.torrent

    Step 15 Don’t repeat. Config it

    Aria2’s config file saves you retyping command line options such as where you want downloads placed, the rate limits for torrents, and the log level. Uncomment and change the defaults as needed – but if your distro doesn’t install a config file, set your own:

    log-level=warn
    max-connection-per-server=4
    min-split-size=5M
    on-download-complete=exit
    listen-port=60000
    dht-listen-port=60000
    seed-ratio=2.0
    max-upload-limit=50K

    Whether aria2, Wget or anything else, using the same options twice is a strong hint that you should start to open up the config file and set some sensible defaults for your most common actions.
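
If you need to create the file yourself, a small helper can do it without clobbering existing settings. This is a sketch under assumptions: write_aria2_conf is a hypothetical name, ~/.aria2/aria2.conf is the default location on the systems we’ve seen (aria2c also accepts --conf-path if yours differs), and the options are a subset of those listed above.

```shell
# Write a starter aria2 config, but only if no file exists at that path.
write_aria2_conf() {
    conf=${1:-$HOME/.aria2/aria2.conf}
    if [ -e "$conf" ]; then
        return 0    # leave existing settings untouched
    fi
    mkdir -p "$(dirname "$conf")"
    cat > "$conf" <<'EOF'
log-level=warn
max-connection-per-server=4
min-split-size=5M
seed-ratio=2.0
max-upload-limit=50K
EOF
}

# Usage:
#   write_aria2_conf                      # default location
#   write_aria2_conf /tmp/test-aria2.conf # or any path you like
```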

    Step 16 On the Beeb

    Get_iplayer is a handy little Perl script that, almost since the launch of the BBC’s iPlayer service, has brought programme catch-up to non-x86 platforms and those without a fast enough connection to stream in real time.

    It occasionally has to play catch-up with changes to the service and, as we go to press, the BBC has dropped the programme data feeds that gave get_iplayer search and PVR capabilities. See the get_iplayer site for any updates on this.

    Step 17 Download by number

    Get_iplayer still works with the pid you see embedded in the iPlayer web page URI for each programme you might want to download, so although you’ll need to browse the website until there’s a workaround, you can at least grab a programme like this:

    get-iplayer --no-purge --pid p01x5k4n

    Step 18 YouTube downloader

    YouTube is a massive knowledge repository, containing instructional videos on everything from Beagle Boards to natural swimming pools (ie big ponds). They’re great for a long train journey where an intermittent Internet connection would make life difficult. Download ahead of time with youtube-dl (which also works with some other sites); just feed it the URL:

    $ youtube-dl "http://youtube.com/watch?v=za8FMIWYtUc"

    If older versions give a 403 error, update or change https to http in the command, as above.

    Step 19 Flash without the web

    Get_flash_videos will usually help on sites where youtube-dl fails, but not always. With both apps, get into the habit of double-quoting URLs, so the shell doesn’t try to interpret special characters like &.
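
A quick way to convince yourself that quoting matters is to print exactly what a command receives. show_args below is a hypothetical stand-in for a downloader, and the URL is a made-up example.

```shell
# Print each argument a command would receive, one per line, in brackets.
show_args() {
    printf '[%s]\n' "$@"
}

# Quoted: the whole URL arrives as a single argument, ? and & included.
show_args "http://example.com/watch?v=abc123&list=xyz"

# Unquoted, the shell would end the command at &, run it in the background,
# and treat list=xyz as a separate variable assignment.
```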

    Step 20 Shared storage and cloud services

    Free and open cloud services are appearing with the burgeoning IndieTech movement, but Dropbox is still the service that most of us have accounts on – particularly as we often have to share files with other users for work. It’s a reasonable place to keep extra copies of config files you share across machines, for example.

    The command-line Dropbox script, which starts the service with dropbox start, saves you running the resource-hogging Nautilus. Use symbolic links to avoid disrupting your normal file hierarchy:

    $ cd ~
    $ mkdir Dropbox/.emacs.d
    $ ln -s Dropbox/.emacs.d
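
The commands above assume there is no ~/.emacs.d yet. If the directory already exists, it has to be moved into the synced folder before the link is made. The sketch below, with the hypothetical name link_into_sync, covers both cases and refuses to guess when copies exist in both places.

```shell
# Move a directory into a synced folder and leave a symlink in its place.
link_into_sync() {
    src=$1
    sync_dir=$2
    target=$sync_dir/$(basename "$src")
    if [ -L "$src" ]; then
        return 0    # already a symlink, nothing to do
    fi
    if [ -e "$src" ] && [ -e "$target" ]; then
        echo "both $src and $target exist; merge them by hand" >&2
        return 1
    fi
    mkdir -p "$sync_dir"
    if [ -e "$src" ]; then
        mv "$src" "$target"     # keep existing contents
    else
        mkdir -p "$target"      # start with an empty synced directory
    fi
    ln -s "$target" "$src"
}

# Usage:
#   link_into_sync ~/.emacs.d ~/Dropbox
```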

    Avoiding Dropbox and others with proprietary components usually means setting up your own Cloud server, but Seafile, which is aimed at collaborating teams, offers 1GB free at seacloud.cc. It also offers software for your own server. Seafile is hosted on Amazon Web Services and written in Python; it’s well worth comparing with other ‘own cloud’ solutions.

    Step 21 Mail servers

    So we’re browsing, downloading and sharing without the browser, but don’t forget command-line email goes back decades before the web. Mutt is still one of the most efficient mailers out there – whether you’re on Gmail, or proudly run your own mail server.

    Whether you’re using the built-in mail (you may need to install mailutils) or go with Mutt, the syntax is similar:

    $ mail -s 'Hello, World!' hi@gmail.com

    Step 22 Browser commands

    If you like the power of the command line, but really spend more time in a browser than a terminal, try YubNub – a command-line-style web interface to search engines and more. Check out yubnub.org/kernel/most_used_commands to see the most popular of the tens of thousands of user-contributed commands.

