On Thu, 12 Jun 2014, davidson@ling.ohio-state.edu wrote:
PS: btw, uzbl has a relatively steep learning curve.
On Thu, 12 Jun 2014, lina wrote:
Hi,
[snip]
I wish to grab part of the CDS entry from
http://www.ncbi.nlm.nih.gov/nuccore/KF699528.2
namely,
"MLDHSSVNSTIAPGNLLNLPVWCYLLETEEGPILVDTGMPESAV
NNEGLFNGTFVEGQILPKMTEEDRIVNILKRVGYEPDDLLYIISSHLHFDHAGGNGAF
TNTPIIVQRTEYEAALHREEYMKECILPHLNYKIIEGDYEVVPGVQLLYTPGHSPGHQ
SLFIETEQSGSILLTIDASYTKENFEDEVPFAGFDPELALSSIKRLKEVVAKEKPIIF
FGHDIEQEKGCKVFPEYIPRAE"
so it would be nice to know how to get the plain html file which
contains this sequence.
can anyone point out something to let me go further?
using uzbl browser, along with either of the scripts on this page...
http://www.uzbl.org/wiki/dump
...i think this can be done. (you can have your choice of html or
plain text.)
if you are in a hurry, here is a kludge that should do what you want:
jarjar@hell:~$ nuccore_fname=KF699528.2
jarjar@hell:~$ uzbl http://www.ncbi.nlm.nih.gov/nuccore/${nuccore_fname} 2>${nuccore_fname}_uzbl_squawks &
[1] 2768
jarjar@hell:~$ uzbl_pid=$!
jarjar@hell:~$ echo 'js document.documentElement.outerHTML' | socat - unix-connect:/tmp/uzbl_socket_${uzbl_pid} > ${nuccore_fname}_done.html
jarjar@hell:~$ grep -A 4 '/translation=' ${nuccore_fname}_done.html
/translation="MLDHSSVNSTIAPGNLLNLPVWCYLLETEEGPILVDTGMPESAV
NNEGLFNGTFVEGQILPKMTEEDRIVNILKRVGYEPDDLLYIISSHLHFDHAGGNGAF
TNTPIIVQRTEYEAALHREEYMKECILPHLNYKIIEGDYEVVPGVQLLYTPGHSPGHQ
SLFIETEQSGSILLTIDASYTKENFEDEVPFAGFDPELALSSIKRLKEVVAKEKPIIF
FGHDIEQEKGCKVFPEYIPRAE"
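
the grep -A 4 above hard-codes how many lines the sequence spans, so it
only fits this one record. here is a rough sketch of a more general
extraction, assuming the saved page contains the qualifier as literal
/translation="..." text, as the grep output above suggests. the function
name extract_translation is made up for illustration.

```shell
# extract_translation FILE
# Prints the first /translation="..." qualifier found in FILE as one
# unbroken line, however many lines it spans in the saved page.
# Assumes the quotes appear literally (not as &quot;) in FILE.
extract_translation() {
    awk '
        !found && /\/translation="/ {
            found = 1
            sub(/.*\/translation="/, "")   # drop everything up to the opening quote
        }
        found {
            if (match($0, /"/)) {          # closing quote: print what precedes it, stop
                printf "%s", substr($0, 1, RSTART - 1)
                exit
            }
            printf "%s", $0                # middle line: print without a newline
        }
        END { if (found) print "" }        # finish the joined line
    ' "$1" | tr -d ' \t\r'                 # remove indentation and stray CRs
}
```

usage would be something like: extract_translation ${nuccore_fname}_done.html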
if uzbl's complaints about the webpage don't interest you, replace
2>${nuccore_fname}_uzbl_squawks with 2>/dev/null.
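
one caveat if you put the steps above in a script instead of typing them:
the socat command may run before uzbl has created its socket. a minimal
sketch of a wait loop follows; the function name wait_for_socket and the
30-try default are my own assumptions, and a socket that exists does not
guarantee the page has finished loading, so an extra sleep may still be
needed.

```shell
# wait_for_socket PATH [TRIES]
# Polls once per second until PATH exists as a unix socket. Returns 0 as
# soon as it appears, or 1 if it is still missing after TRIES attempts
# (default 30, an arbitrary choice).
wait_for_socket() {
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        [ -S "$1" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

in a script it would go between launching uzbl and calling socat, e.g.
wait_for_socket /tmp/uzbl_socket_${uzbl_pid} || exit 1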
anyways, would be interesting to hear what solutions you find.
-wes