On 9/22/12 6:01 PM, craig@gtek.biz wrote:
Greetings,
I have a small book collection (~150) that I thought would be neat to
catalog by the Library of Congress catalog numbers. I have found a
LOC search form that will allow me to input the ISBN, and it will
return the information I want:
[code]http://www.loc.gov/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090[/code]
I have the list of book ISBNs in a text file, so scripting this
should be quite easy. The problem is I can't figure out how to submit
the form from the command line. I figured wget would be the best way,
but everything I try results in downloading a single line that reads
"Your form didn't include an ACTION!" So I thought I would turn here
for help. The test ISBN I am using is for The Linux Cookbook:
1886411484, QA76.76.O63S788 2001.
[snip]
If you want to screen scrape, the URI would be like this:
http://www.loc.gov/cgi-bin/zgate?ACTION=SEARCH&DBNAME=VOYAGER&ESNAME=B&MAXRECORDS=20&RECSYNTAX=1.2.840.10003.5.10&REINIT=/cgi-bin/zgate?ACTION=INIT&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090&srchtype=1,1016,2,102,3,3,4,2,5,100,6,1&SESSION_ID=4493330&TERM_1=1886411484
However, the session ID expires after only a few minutes, so you will
need a fresh one.
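As a rough sketch of how the scripting side could look, the search URI
above can be treated as a template in which only SESSION_ID and TERM_1
change per request. The helper names below (normalize_isbn,
build_search_url) are made up for illustration, and the sketch assumes
you already have a valid session ID; how to obtain a fresh one is not
covered here.

```python
from urllib.parse import quote

# The search URI from this message, copied verbatim, with only the
# session ID and the search term turned into placeholders.
SEARCH_TEMPLATE = (
    "http://www.loc.gov/cgi-bin/zgate?ACTION=SEARCH&DBNAME=VOYAGER&ESNAME=B"
    "&MAXRECORDS=20&RECSYNTAX=1.2.840.10003.5.10"
    "&REINIT=/cgi-bin/zgate?ACTION=INIT"
    "&FORM_HOST_PORT=/prod/www/data/z3950/locils2.html,z3950.loc.gov,7090"
    "&srchtype=1,1016,2,102,3,3,4,2,5,100,6,1"
    "&SESSION_ID={session_id}&TERM_1={isbn}"
)


def normalize_isbn(raw):
    """Drop hyphens, spaces and newlines; keep digits and a trailing X."""
    return "".join(ch for ch in raw if ch.isalnum()).upper()


def build_search_url(isbn, session_id):
    """Return the search URI for one ISBN (hypothetical helper).

    session_id is assumed to be a still-valid ID supplied by the caller.
    """
    return SEARCH_TEMPLATE.format(
        session_id=quote(str(session_id)),
        isbn=quote(normalize_isbn(isbn)),
    )


if __name__ == "__main__":
    # The test ISBN and the (by now expired) session ID from this thread.
    print(build_search_url("1886411484", 4493330))
```

Each generated URI can then be handed to wget. Put it in single quotes
on the command line, since an unquoted & is interpreted by the shell
rather than passed to wget.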
Regards,
/Lars