Gentoo Forums
wget - filename after redirect [SOLVED]
alex.blackbit
Advocate

Joined: 26 Jul 2005
Posts: 2397

Posted: Mon Sep 05, 2011 5:14 pm    Post subject: wget - filename after redirect [SOLVED]

hi,

i'm asking here because i couldn't find a solution elsewhere. i have a shell script that downloads a long list of URLs, all from the same site, and all of them redirect. each downloaded file is named after the original URL, not the URL after the redirect. i would like the files to be named the way the URL names them _after_ the redirect. is that possible ?


Last edited by alex.blackbit on Tue Sep 06, 2011 11:22 am; edited 1 time in total
truc
Advocate

Joined: 25 Jul 2005
Posts: 3199

Posted: Mon Sep 05, 2011 5:38 pm

Code:
# read one URL per line; IFS= and -r keep whitespace and backslashes intact
while IFS= read -r url; do
    wget -O "${url##*/}" "$url"
done < URL_LIST.txt


:?:
_________________
The End of the Internet!
alex.blackbit
Advocate

Joined: 26 Jul 2005
Posts: 2397

Posted: Mon Sep 05, 2011 7:08 pm    Post subject: Re: wget - filename after redirect

alex.blackbit wrote:
_after_ the redirect. is that possible ?

truc, thanks for your answer, but i think you misunderstood me.
Hu
Administrator

Joined: 06 Mar 2007
Posts: 23062

Posted: Mon Sep 05, 2011 9:52 pm

Use wget --trust-server-names. If I recall correctly, there are some potential security concerns with this in that a malicious server could refer you to a URL which causes you to overwrite something important.
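
One way to limit the overwrite risk is to let wget write only into a dedicated directory. The following is a minimal sketch, not a tested recipe: fetch_redirected is a hypothetical helper name and ./downloads is an assumed default; --trust-server-names and --directory-prefix are existing wget options.

```shell
# Sketch: fetch one URL, naming the file after the post-redirect URL,
# but confine whatever gets written to a dedicated directory.
# fetch_redirected is a hypothetical helper name, not part of wget.
fetch_redirected() {
    url=$1
    dest=${2:-./downloads}   # assumed default target directory
    mkdir -p "$dest" || return 1
    wget --trust-server-names --directory-prefix="$dest" "$url"
}
```

It would be called as fetch_redirected "$url" ./scratch, so a surprising file name from the server can only land inside ./scratch.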
truc
Advocate

Joined: 25 Jul 2005
Posts: 3199

Posted: Tue Sep 06, 2011 6:48 am    Post subject: Re: wget - filename after redirect

alex.blackbit wrote:
alex.blackbit wrote:
_after_ the redirect. is that possible ?

truc, thanks for your answer, but i think you misunderstood me.


Oh, you're right, sorry :)
alex.blackbit
Advocate

Joined: 26 Jul 2005
Posts: 2397

Posted: Tue Sep 06, 2011 10:54 am

Hu wrote:
Use wget --trust-server-names. If I recall correctly, there are some potential security concerns with this in that a malicious server could refer you to a URL which causes you to overwrite something important.

Code:

       --trust-server-names
           If this is set to on, on a redirect the last component of the redirection URL will
           be used as the local file name.  By default it is used the last component in the
           original URL.

yes. it's really amazing how one can overlook such obvious things in a manpage.
thanks a lot for the pointer!
wget now uses the last file name it sees (the one it was redirected to).
surprisingly, the result is very different from what firefox produces.
unfortunately, the firefox result is the one i want, not the wget one. :(

i am talking about the flac files from here.
when i click in firefox on a download button (the down arrow), the file is saved as e.g. "2011.05.07 - Essential Mix - Seth Troxler.flac".
with wget --trust-server-names the file is saved as this fucking string "oBPmrhuqWBJV?AWSAccessKeyId=AKIAJBHW5FB4ERKUQUOQ&Expires=1315306316&Signature=WYbHo1xaEnWc5ECCqQFuK9BFQxA=&__gda__=1315306316_05d3a805fa3acfa25baf7f55c7de46d8".
with plain, optionless wget the file is just saved as "download", as in the original URL.

any ideas left?
alex.blackbit
Advocate

Joined: 26 Jul 2005
Posts: 2397

Posted: Tue Sep 06, 2011 11:21 am

i got it.
the missing hint turned up when i looked at the http traffic.
the filename is sent in a "Content-Disposition" header.
wget has a command line option --content-disposition, marked experimental, but it works as expected here.
Code:
       --content-disposition
           If this is set to on, experimental (not fully-functional) support for
           "Content-Disposition" headers is enabled. This can currently result in extra
           round-trips to the server for a "HEAD" request, and is known to suffer from a few
           bugs, which is why it is not currently enabled by default.

           This option is useful for some file-downloading CGI programs that use
           "Content-Disposition" headers to describe what the name of a downloaded file
           should be.

again, the hints are in the manpage. what a shame.
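
To see which name the server proposes before downloading anything, the header can be inspected directly. This is a minimal sketch under stated assumptions: cd_filename is a hypothetical helper, the sample header line is made up from the file name mentioned above, and only the plain filename=... form is handled (not the encoded filename*= variant).

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# file name from the first Content-Disposition header, if there is one.
cd_filename() {
    tr -d '\r' |
    grep -i '^[[:space:]]*content-disposition:' |
    sed -n 's/.*filename="\{0,1\}\([^";]*\)"\{0,1\}.*/\1/p' |
    head -n 1
}

# Offline demo with a made-up header line (no network access needed).
printf '%s\r\n' \
    'HTTP/1.1 200 OK' \
    'Content-Disposition: attachment; filename="2011.05.07 - Essential Mix - Seth Troxler.flac"' |
    cd_filename
# prints: 2011.05.07 - Essential Mix - Seth Troxler.flac
```

With real traffic the input might come from wget --spider --server-response "$url" 2>&1 | cd_filename, assuming the server sends the header on that request.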

for reference, this is the command line i am using:
Code:

$ lynx -dump -listonly "http://soundcloud.com/das-boy/sets/essential-mix/" | grep "^[[:digit:]]\{1,4\}\.\ http" | grep "/download$" | awk '{ print $2; }' | while read i; do wget --content-disposition "${i}"; done

thanks for your thoughts.
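
The same download loop can also be fed from a plain list file such as the URL_LIST.txt used earlier in the thread. A minimal sketch, assuming one URL per line; fetch_with_server_names is a hypothetical helper name, and skipping blank and '#' lines is an added convenience, not something from the original command.

```shell
# Hypothetical helper: read URLs (one per line) from stdin and fetch each,
# letting wget take the file name from the Content-Disposition header.
fetch_with_server_names() {
    while IFS= read -r url; do
        # Skip blank lines and comment lines in the list.
        case $url in ''|'#'*) continue ;; esac
        wget --content-disposition "$url" || echo "failed: $url" >&2
    done
}
```

Usage would be fetch_with_server_names < URL_LIST.txt, or the lynx pipeline above can be piped into it in place of the inline while loop.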
Hu
Administrator

Joined: 06 Mar 2007
Posts: 23062

Posted: Wed Sep 07, 2011 2:16 am

alex.blackbit wrote:
Code:
$ lynx -dump -listonly "http://soundcloud.com/das-boy/sets/essential-mix/" | grep "^[[:digit:]]\{1,4\}\.\ http" | grep "/download$" | awk '{ print $2; }' | while read i; do wget --content-disposition "${i}"; done

thanks for your thoughts.
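You could compact that down to a single gawk call whose pattern combines both tests, instead of a pair of greps and an awk. awk patterns are extended regular expressions, so the interval braces are written without backslashes (gawk older than 4.0 needs --re-interval for them). Try (untested):
Code:
lynx ... | gawk '/^[[:digit:]]{1,4}\. http.*\/download$/ { print $2; }' | while ...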
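
The single-awk idea can be checked offline without lynx. This is a minimal sketch with made-up sample lines in the shape of lynx -dump -listonly output; sample_links and download_urls are hypothetical names, and the pattern deliberately uses [0-9]+ with optional leading spaces instead of the [[:digit:]] interval, so that it also runs on awk implementations without interval or character-class support.

```shell
# Hypothetical sample in the shape of `lynx -dump -listonly` output (made-up URLs).
sample_links() {
    printf '%s\n' \
        'References' \
        '' \
        '   1. http://example.com/artist/track-one' \
        '   2. http://example.com/artist/track-one/download' \
        '   3. http://example.com/artist/track-two/download' \
        '   4. http://example.com/download/help'
}

# One awk call replaces grep | grep | awk: the pattern selects numbered
# http links ending in /download, the action prints the URL field.
download_urls() {
    awk '/^ *[0-9]+\. http.*\/download$/ { print $2 }'
}

sample_links | download_urls
# prints the two URLs ending in /download (entries 2 and 3)
```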
All times are GMT