BibTeX database from PDFs via DOI

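This somewhat ridiculous Bash one-liner creates a BibTeX database file (.bib) from a directory of PDFs by looking up each paper's DOI through the Crossref API. It needs pdftotext and curl, and it only works if the PDF prints its DOI on the first page, written as a lowercase "doi:" followed by the identifier. Since DOIs were introduced in 2000, it will probably not work on vintage PDFs.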

for pdf in *.pdf; do pdftotext -f 1 -l 1 "$pdf" - | grep -oE "doi:\s?[A-Za-z0-9./-]+" | sed -r 's;doi:\s?;https://api.crossref.org/works/;g' | sed -r 's;$;/transform/application/x-bibtex;g' | xargs curl -fsS 2>/dev/null | sed -e '$a\'; done > allpdf.bib
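
For anyone who would rather read the steps than decode the pipeline, here is a rough sketch of the same idea split into small functions. The function names (extract_dois, crossref_bibtex_url, pdfs_to_bib) are made up for illustration, and the DOI pattern is the same simple one as in the one-liner, so it has the same blind spots. It assumes pdftotext and curl are installed.

```shell
#!/usr/bin/env bash
# Sketch only: the one-liner's steps as named functions (names are illustrative).

# Read text on stdin and print each "doi:..." match as a bare DOI, one per line.
extract_dois() {
  grep -oE 'doi:[[:space:]]?[A-Za-z0-9./-]+' | sed -E 's;^doi:[[:space:]]?;;'
}

# Turn a bare DOI into the Crossref URL that returns the record as BibTeX.
crossref_bibtex_url() {
  printf 'https://api.crossref.org/works/%s/transform/application/x-bibtex\n' "$1"
}

# Print BibTeX entries for every PDF passed as an argument.
# Only the first page of each PDF is searched, as in the one-liner.
pdfs_to_bib() {
  local pdf doi
  for pdf in "$@"; do
    pdftotext -f 1 -l 1 "$pdf" - | extract_dois | while IFS= read -r doi; do
      # On success, add a newline so consecutive entries do not run together.
      curl -fsS "$(crossref_bibtex_url "$doi")" 2>/dev/null && echo
    done
  done
}
```

After sourcing the file, running pdfs_to_bib *.pdf > allpdf.bib should behave much like the one-liner. PDFs without a matching "doi:" string, and lookups that fail, are skipped silently.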