Subject: refer to BibTeX conversion

Further to Peter King's message in TeXhax Digest V88 #79, I have a
program which is also based on "awk" and "sed", called "ref2bib", which
will convert "refer" format to "BibTeX" format. The heuristics have been
developed by applying the program to a number of "refer" format databases
here at Oxford. Ref2bib can produce most of the types of record which are
allowed by BibTeX.

Since it is a shell script, ref2bib can easily be modified if required.
Additionally, if a file called ".ref2bib" exists in the user's home
directory, then this is used as a "sed" script source for individual
customization.

Note that ref2bib was developed entirely separately from r2bib and
Peter King's program. It may or may not make a better job of conversion,
depending on the "refer" format input supplied. It also has options to
determine the naming of entries and the folding of output.

The program is accessible from the Programming Research Group archive
server at Oxford University by mailing a message containing:

	send prog ref2bib.shar

to or . A message containing "help" can also be sent for more
information on using the archive server.

I am happy for this program to be placed on any archive server, etc.,
in the US and elsewhere to reduce international traffic.

Jonathan Bowen
Oxford University Computing Laboratory
Programming Research Group
England

JANET:	bowen@uk.ac.oxford.prg
ARPA:	bowen%prg.oxford.ac.uk@nss.cs.ucl.ac.uk
UUCP:	bowen@ox-prg.uucp (...!uunet!mcvax!ukc!ox-prg!bowen)

------------- snip snip ---------------
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#	README
#	Makefile
#	example.ref2bib
#	ref2bib
#	ref2bib.1
# This archive created: Thu Jun 23 14:22:17 1988
# By:	Jonathan Bowen (Programming Research Group, Oxford University, UK)
export PATH; PATH=/bin:$PATH
echo shar: extracting "'README'" '(841 characters)'
if test -f 'README'
then
	echo shar: will not over-write existing file "'README'"
else
sed 's/^ X//' << \SHAR_EOF > 'README'
 XThese files relate to the ``ref2bib'' command for converting
 Xrefer format bibliographic database files to BibTeX format
 Xfiles for use with LaTeX. The files include:
 X
 X	README		this file
 X	Makefile	make and install files
 X	example.ref2bib	example ".ref2bib" file
 X	ref2bib		shell script which does the work
 X	ref2bib.1	manual page
 X
 X
 XWritten by
 X	Jonathan Bowen, October 1987.
 X	Oxford University Computing Laboratory,
 X	Programming Research Group,
 X	8-11 Keble Road,
 X	Oxford OX1 3QD,
 X	England.
 X	Tel: +44-865-272574 (Sec: +44-865-273840)
 X
 XCopyright (C) 1987,1988 by J.P.Bowen
 X
 XPermission is granted to copy these files for
 Xnon-profit purposes, provided this notice is left intact.
X X JANET: bowen@uk.ac.oxford.prg X ARPA: bowen%prg.oxford.ac.uk@nss.cs.ucl.ac.uk X UUCP: ...!uunet!mcvax!ukc!ox-prg!bowen X SHAR_EOF if test 841 -ne "`wc -c < 'README'`" then echo shar: error transmitting "'README'" '(should have been 841 characters)' fi fi # end of overwriting check echo shar: extracting "'Makefile'" '(329 characters)' if test -f 'Makefile' then echo shar: will not over-write existing file "'Makefile'" else sed 's/^ X//' << \SHAR_EOF > 'Makefile' XBIN=/usr/local/bin XMAN=/usr/man/man1 XSRC=README Makefile example.ref2bib ref2bib ref2bib.1 X Xinstall: X chmod +x ref2bib X cp ref2bib $(BIN)/ref2bib X cp ref2bib.1 $(MAN)/ref2bib.1 X chown root $(BIN)/ref2bib $(MAN)/ref2bib.1 X chgrp system $(BIN)/ref2bib $(MAN)/ref2bib.1 X Xshar: X shar -a $(SRC) > ref2bib.shar X compress ref2bib.shar X SHAR_EOF if test 329 -ne "`wc -c < 'Makefile'`" then echo shar: error transmitting "'Makefile'" '(should have been 329 characters)' fi fi # end of overwriting check echo shar: extracting "'example.ref2bib'" '(226 characters)' if test -f 'example.ref2bib' then echo shar: will not over-write existing file "'example.ref2bib'" else sed 's/^ X//' << \SHAR_EOF > 'example.ref2bib' Xs/Sorensen/S{\\o}rensen/g Xs/Jorgensen/J{\\o}rgensen/g Xs/Molgaard/M{\\o}lgaard/g X/^ TITLE = /s/\([{ ]\)Z\([ :\.,}]\)/\1{Z}\2/g X/^ TYPE = /s/onographs}$/onograph}/ X/^@/s/{ed\./{Ed/ Xs/^ AUTHOR = {\(.*\), ed\.},$/ AUTHOR = {\1},/ SHAR_EOF if test 226 -ne "`wc -c < 'example.ref2bib'`" then echo shar: error transmitting "'example.ref2bib'" '(should have been 226 characters)' fi fi # end of overwriting check echo shar: extracting "'ref2bib'" '(10690 characters)' if test -f 'ref2bib' then echo shar: will not over-write existing file "'ref2bib'" else sed 's/^ X//' << \SHAR_EOF > 'ref2bib' X#!/bin/sh X# X# ref2bib - convert Unix "refer" format X# to BibTeX "bib" format. X# X# Written by Jonathan Bowen, Oxford University, October 1987. 
X# X# Copyright (C) 1987, J.P.Bowen X# X# Permission is granted to copy this shell script for X# non-profit purposes, provided this header is left intact. X# X# JANET: bowen@uk.ac.oxford.prg X# ARPA: bowen%prg.oxford.ac.uk@nss.cs.ucl.ac.uk X# X XPATH=/bin:/usr/bin:/usr/ucb XPROGNAME=`basename $0` XDEFAULTWIDTH=72 XDEBUG=false XWIDTH=$DEFAULTWIDTH XUSENAME=false XNAMEDFILES=false XBIB=bib X Xwhile expr X$1 : X'-' > /dev/null Xdo X case "$1" in X -|-0|-w) X WIDTH= X ;; X -[1-9]|-[1-9][0-9]|-[1-9][0-9][0-9]) X WIDTH=`expr X"$1" : X'-\(.*\)'` X ;; X -a) : Name by author and year X USENAME=true X ;; X -d) : Enable debugging X DEBUG=true X ;; X -n) : Named output files X NAMEDFILES=true X ;; X -u|-U) X echo "Usage: $PROGNAME [ options ] [ file ... ] XConverts Unix \"refer\" format to \"BibTeX\" database format. X-a name entries by author and year (default=$USENAME) X-d enable debugging (default=$DEBUG) X-n output to named files (ext \".$BIB\") (default=$NAMEDFILES) X-w no maximum width X-u display usage information X-N maximum width of N characters (1-999) (default=$DEFAULTWIDTH)" X exit 0 X ;; X -*) X echo "Usage: $PROGNAME [ -a -[width] ] [ file ... ]" X exit 0 X ;; X esac X shift Xdone X XGEN=`date -u`" on "`hostname` XNAME=$BIB XEDITFILE=$HOME/.$PROGNAME XSTDIN='' XNEWFILE="" X X$DEBUG && echo "Generated: <$GEN>" 1>&2 X$DEBUG && echo "Width: <$WIDTH>" 1>&2 X X# Process each file, or if none given, standard input Xfor FILE in ${*-$STDIN} Xdo X X# First set up shell variables as required X if [ "$FILE" = "$STDIN" ] X then X NEWFILE=$NAME.$BIB X else X if [ -r "$FILE" -a -f "$FILE" ] X then X NAME=`basename $FILE` X NEWFILE=$FILE.$BIB X else X NAME="" X echo "$PROGNAME: Can't read $FILE" 1>&2 X fi X fi X X# If all is OK, read input and terminate with a blank line. X if [ "$NAME" ] X then X if [ "$FILE" = "$STDIN" ] X then X# If no files given, read from standard input. 
X $DEBUG && echo "Reading from standard input" 1>&2 X cat X echo X else X $DEBUG && echo "Reading <$FILE>" 1>&2 X cat $FILE X echo X fi | X X# Expand and remove trailing spaces if present Xexpand | sed 's/ *$//' | X X# Next do the real work Xawk 'BEGIN { # Initialization X gen="'"$GEN"'" X default="'"$NAME"'" X } X/^%/ { # Any % line X entry=substr($0,4) X percent=1 X } X/^%A / { # Author X if (author == "") { X name4 = substr($NF,1,4) X author = entry X } X else author = sprintf("%s and %s",author,entry) X } X/^%B / { # Book X if (booktitle == "") booktitle = entry X else booktitle = sprintf("%s %s",booktitle,entry) X entrytype = "INCOLLECTION" X } X/^%C / { # City X if (address == "") address = entry X else address = sprintf("%s %s",address,entry) X } X/^%D / { # Date X if (year == "") { X year = $NF X year2 = substr($NF,length($NF)-1) X } X if (month == "" && NF > 2) X month = substr(entry,1,length(entry)-length($NF)-1) X } X/^%E / { # Editor X if (editor == "") editor = entry X else editor = sprintf("%s and %s",editor,entry) X if (entrytype == "") entrytype = "BOOK" X } X/^%F / { # Footnote X if (note == "") note = entry X else note = sprintf("%s %s",note,entry) X } X/^%G / { # Government order number X } X/^%H / { # Header X if (annote == "") annote = entry X else annote = sprintf("%s %s",annote,entry) X } X/^%I / { # Issuer X if (institution == "") institution = entry X else institution = sprintf("%s %s",institution,entry) X } X/^%J / { # Journal X if (index(entry,"Proc.") > 0 || index(entry,"Proceedings") || \ X index(entry,"Conference") > 0) { X if (booktitle == "") booktitle = entry X else booktitle = sprintf("%s, %s",booktitle,entry) X entrytype = "INPROCEEDINGS" X } X else { X if (journal == "") journal = entry X else journal = sprintf("%s %s",journal,entry) X entrytype = "ARTICLE" X } X } X/^%K / { # Keyword X if (keywords == "") keywords = entry X else keywords = sprintf("%s, %s",keywords,entry) X } X/^%L / { # Label X if (label == "") label = $2 X } 
X/^%M / { # Memorandum X } X/^%N / { # Number X if (number == "") number = entry X } X/^%O / { # Other (conference) X if (conference == "") conference = entry X else conference = sprintf("%s %s",conference,entry) X entrytype = "INPROCEEDINGS" X } X/^%P / { # Page(s) X if (index(entry,"-") > 0 || index(entry,",") > 0) { X if (pages == "") pages = entry X else pages = sprintf("%s,%s",pages,entry) X } X else if (numberofpages == "") numberofpages = entry X } X/^%Q / { # Corporate or foreign author (surname first) X if (author == "") { X name4 = substr($2,1,4) X author = entry X } X else author = sprintf("%s and %s",author,entry) X } X/^%R / { # Report, paper or thesis (i.e. unpublished) X if (index(entry,"Ph") > 0) entrytype="PHDTHESIS" X if (index(entry,"hesis") > 0) entrytype="PHDTHESIS" X if (index(entry,"Sc") > 0) entrytype="MASTERSTHESIS" X if (index(entry,"aster") > 0) entrytype="MASTERSTHESIS" X if (index(entry,"ech") > 0) entrytype="TECHREPORT" X if (index(entry,"eport") > 0) entrytype="TECHREPORT" X if (index(entry,"aper") > 0) entrytype="UNPUBLISHED" X if (index(entry,"orking") > 0) entrytype="UNPUBLISHED" X if (index(entry,"npub") > 0) entrytype="UNPUBLISHED" X if (index(entry,"ocument") > 0) entrytype="MANUAL" X if (index(entry,"anual") > 0) entrytype="MANUAL" X if (entrytype=="UNPUBLISHED") { X if (note == "") note = entry X else note = sprintf("%s %s",note,entry) X } X else { X if (type == "") type = entry X } X } X/^%S / { # Series title X if (series == "") series = entry X else series = sprintf("%s %s",series,entry) X if (index(entry,"Tech") > 0) entrytype="TECHREPORT" X } X/^%T / { # Title X if (title == "") title = entry X else title = sprintf("%s %s",title,entry) X } X/^%U / { # Unused X } X/^%V / { # Volume X if (volume == "") volume = entry X } X/^%X / { # Abstract X abstract[++xcount] = entry X } X/^%Y / { # Unused X } X/^%Z / { # Can be used to supply an entry name X name = entry X } X/^[^%]/ { # Other lines - abstract if started by "%X" field X 
if (xcount != 0) X abstract[++xcount] = $0 X } X/^$/ { # Empty line delimits a record X if (percent == 1) { # End of a record X if (index(entrytype,"BOOK") > 0) { X publisher = institution X institution = "" X } X if (entrytype == "" || entrytype == "BOOK") { X if (pages != "") { X if (booktitle != "") entrytype = "INCOLLECTION" X else entrytype = "INBOOK" X } X } X if (entrytype == "MANUAL" || index(entrytype,"PROCEEDINGS") > 0) { X organization = institution X institution = "" X } X if (entrytype == "TECHREPORT") { X if (type == "") { X type = series X if (substr(type,length(type)) == "s") { X type = substr(type,1,length(type)-1) X } X series = "" X } X if (number == "") { X number = volume X volume = "" X } X } X if (index(entrytype,"THESIS") > 0) { X school = institution X institution = "" X } X if ("'"$USENAME"'" == "true") name = sprintf("%s%s",name4,year2) X if (name == "") name = default; X if (entrytype == "") entrytype = "MISC"; X printf "\n@%s{%s:%03d,\n",entrytype,name,++num X if (author != "") X printf "\tAUTHOR = {%s},\n",author; X if (editor != "") X printf "\tEDITOR = {%s},\n",editor; X if (title != "") X printf "\tTITLE = {%s},\n",title; X if (booktitle != "") X printf "\tBOOKTITLE = {%s},\n",booktitle; X if (journal != "") X printf "\tJOURNAL = {%s},\n",journal; X if (volume != "") X printf "\tVOLUME = {%s},\n",volume; X if (number != "") X printf "\tNUMBER = {%s},\n",number; X if (pages != "") X printf "\tPAGES = {%s},\n",pages; X if (type != "") X printf "\tTYPE = {%s},\n",type; X if (series != "") X printf "\tSERIES = {%s},\n",series; X if (institution != "") X printf "\tINSTITUTION = {%s},\n",institution; X if (organization != "") X printf "\tORGANIZATION = {%s},\n",organization; X if (school != "") X printf "\tSCHOOL = {%s},\n",school; X if (publisher != "") X printf "\tPUBLISHER = {%s},\n",publisher; X if (address != "") X printf "\tADDRESS = {%s},\n",address; X if (conference != "") X printf "\tCONFERENCE = {%s},\n",conference; X if (key != 
"") X printf "\tKEY = {%s},\n",key; X if (keywords != "") # Non-standard keyword field X printf "\tKEYWORDS = {%s},\n",keywords; X if (label != "") # Non-standard label field X printf "\tLABEL = {%s},\n",label; X if (numberofpages != "") # Non-standard length (in pages) field X printf "\tLENGTH = {%s},\n",numberofpages; X if (year != "") X printf "\tYEAR = {%s},\n",year; X if (month != "") X printf "\tMONTH = {%s},\n",month; X if (annote != "") X printf "\tANNOTE = {%s},\n",annote; X if (note != "") X printf "\tNOTE = {%s},\n",note; X if (xcount != 0) { X printf "\tABSTRACT = {%s",abstract[1]; X for (i=2; i<=xcount; i++) X printf "\n\t\t%s",abstract[i]; X printf "},\n" X } X printf "\tGENERATED = {%s}\n}\n",gen X } X author="" X booktitle="" X address="" X year="" X month="" X editor="" X annote="" X note="" X institution="" X organization="" X school="" X publisher="" X address="" X conference="" X journal="" X key="" X keywords="" X label="" X number="" X pages="" X numberofpages="" X type="" X series="" X title="" X volume="" X name="" X entrytype="" X name4="" X year2="" X percent=0 X xcount=0 X }' | X X# Some things can be automatically edited X sed 's/\([\\\$&#^_]\)/\\\1/g Xs/\([{ ]\)LaTeX\([} :]\)/\1{\\LaTeX}\2/g Xs/\([{ ]\)TeX\([} :]\)/\1{\\TeX}\2/g X/^ TITLE = /{ X s/\([{ ]\)\([B-Z]\)\([ :\.,}]\)/\1{\2}\3/g X s/\([{ ]\)\([A-Z][A-Z][A-Z]*\)\([ :\.,}]\)/\1{\2}\3/g X} Xs/\\0/\~/g Xs/"\([^"]*\)"/``\1'"''"'/g Xs/ - / --- /g Xs/\([0-9]\)-\([0-9]\)/\1--\2/g' | X X# Other edits can be customised by the user X if [ -r $EDITFILE -a -f $EDITFILE ] X then X sed -f $EDITFILE X else X cat X fi | X X# Optionally fold lines X if [ "$WIDTH" ] X then X awk 'BEGIN {width='"$WIDTH"'} X/^[^ ]|^$/ {printf "\n%s",$0} # Print most lines normally X/^ [^ ]/ { # Fold lines starting with a single tab X i=1 X while (i <= NF) { X if (i > 1) { X printf "\n\t\t%s",$i X pos = 16+length($i) X } X else { X printf "\n\t%s",$i X pos = 8+length($i) X } X if (++i <= NF) { X pos += 1+length($i) 
X while (pos <= width) { X printf " %s",$i X if (++i > NF) break X pos += 1+length($i) X } X } X } X } X/^ / { # Fold multiple lines starting with a double tab X i=1 X while (i <= NF) { X pos += 1+length($i) X if (pos > width) { X printf "\n\t\t%s",$i X pos = 16+length($i) X } X else { X printf " %s",$i X } X if (++i <= NF) { X pos += 1+length($i) X while (pos <= width) { X printf " %s",$i X if (++i > NF) break X pos += 1+length($i) X } X } X } X } XEND {printf "\n"}' X else X cat X fi | X X# Finally, output to named files or standard output X if $NAMEDFILES X then X $DEBUG && echo "Output to <$NEWFILE>" 1>&2 X cat > $NEWFILE X else X cat X fi X X fi Xdone X Xexit 0 SHAR_EOF echo shar: 1 control character may be missing from "'ref2bib'" if test 10690 -ne "`wc -c < 'ref2bib'`" then echo shar: error transmitting "'ref2bib'" '(should have been 10690 characters)' fi chmod +x 'ref2bib' fi # end of overwriting check echo shar: extracting "'ref2bib.1'" '(3078 characters)' if test -f 'ref2bib.1' then echo shar: will not over-write existing file "'ref2bib.1'" else sed 's/^ X//' << \SHAR_EOF > 'ref2bib.1' X.TH REF2BIB 1L "23 October 1987" X.UC X.SH NAME Xref2bib \- convert refer formatted records to BibTeX format X.SH SYNOPSIS X.B ref2bib X[ options ] [ file ... ] X.SH DESCRIPTION X.I Ref2bib Xreads input from a number of files or standard input. XGroups of consecutive lines starting with a ``%'' character Xare assumed to be records in \fIrefer\fP(1) format Xseparated by one or more empty lines. X(See the \fIaddbib\fP(1) manual page for further details.) XThese are translated into BibTeX database format records on Xstandard output. XOther lines are ignored and thrown away except for Xlines after an ``%X'' field in a record (see below). XDetails of the BibTeX format may be found in the XLaTeX User's Guide & Reference Manual on pp140\-147. 
 X.PP
 XThe BibTeX records are given names of the form ``filename:ddd''
 Xwhere ``filename'' is the basename of the current file
 X(or ``bib'' if reading from standard input), and ``ddd'' is
 Xa three digit number starting from ``001''.
 XThe filename may be replaced by including a (normally unused)
 X``%Z'' field in the refer record.
 X.PP
 XThe ``%X'' field may be used to hold an abstract.
 XAny lines in a record after a ``%X'' which do not start
 Xwith a ``%'' are considered to hold further abstract details
 Xand are included in the corresponding BibTeX record.
 X.PP
 XSome words and character sequences (such as ``LaTeX'') are detected
 Xin the input automatically and converted to a suitable LaTeX form.
 XThe BibTeX output may be further massaged through \fIsed\fP(1)
 Xby placing a \fIsed\fP script in the file \fI$HOME/.ref2bib\fP.
 XFor example, accents for known names may be added automatically.
 XE.g.:
 X.IP
 Xs/Sorensen/S{\\\\o}rensen/g
 X.LP
 XNote that backslashes need to be doubled up since they are normally
 Xused as an escape sequence.
 X.PP
 XThe output is normally folded at word boundaries to ensure that
 Xlines do not become too long on output.
 X.SH OPTIONS
 XThe following options are available:
 X.TP 10
 X.B \-N
 XSpecify a maximum width for the output.
 XThe default is 72 characters.
 XIf N is omitted then lines are not folded and may be of any length.
 X.TP 10
 X.B \-a
 XName the record using the first four letters of the first author's
 Xsurname and the last two digits of the year, rather than using
 Xthe filename (e.g. ``Bowe87:001'').
 X.TP 10
 X.B \-n
 XUse the name of the input file(s) to produce output file(s) with
 Xthe same name and extension ``.bib'' rather than sending the output
 Xto standard output.
 X.TP 10
 X.B \-u
 XDisplay the usage of the command.
 X.TP 10
 X.B \-w
 XDo not fold the output. Lines may be of any length.
 X.SH FILES
 X.PD 0
 X.TP 40
 X$HOME/.ref2bib
 X\fIsed\fP script for output
 X.TP 40
 X*.bib
 XBibTeX database files
 X.TP 40
 X*.bst
 XBibTeX style files
 X.PD
 X.SH "SEE ALSO"
 Xaddbib(1),
 Xbibtex(1),
 Xfold(1),
 Xlatex(1),
 Xrefer(1),
 Xsed(1),
 X.br
 X``LaTeX User's Guide & Reference Manual'' by Leslie Lamport
 X.SH AUTHOR
 XJonathan Bowen, Oxford University
 X.SH BUGS
 XThis program is not perfect, needless to say, and thus
 Xthe output produced may require further hand editing.
 X.PP
 XThe shell script and manual page may change without notice.
 X.PP
 XPlease report problems to \fI\fP.
SHAR_EOF
if test 3078 -ne "`wc -c < 'ref2bib.1'`"
then
	echo shar: error transmitting "'ref2bib.1'" '(should have been 3078 characters)'
fi
fi # end of overwriting check
#	End of shell archive
exit 0
-------
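
For readers who want a feel for the approach before unpacking the archive, here is a much-reduced sketch of the two main stages described above: an "awk" pass that collects "%" fields until an empty line ends the record, followed by a "sed" pass of the kind a ".ref2bib" file would supply. It is not the ref2bib script itself. The function name refer2bib_sketch and the sample record are invented for illustration, only the %A, %T, %J and %D fields are handled, and the entry name is fixed at "bib" as it would be when reading standard input.

```shell
#!/bin/sh
# Reduced illustration only; the real ref2bib handles many more fields
# and entry types. refer2bib_sketch is a hypothetical name.
refer2bib_sketch() {
    # Terminate the input with an empty line so the last record is flushed.
    { cat; echo; } |
    awk '
    /^%/   { seen = 1 }                      # some field seen in this record
    /^%A / { e = substr($0, 4)               # authors are joined with "and"
             author = (author == "") ? e : author " and " e }
    /^%T / { title = substr($0, 4) }
    /^%J / { journal = substr($0, 4); type = "ARTICLE" }
    /^%D / { year = $NF }                    # last word of the date line
    /^$/ {                                   # empty line delimits a record
        if (seen) {
            if (type == "") type = "MISC"
            printf "@%s{bib:%03d", type, ++num
            if (author != "")  printf ",\n\tAUTHOR = {%s}", author
            if (title != "")   printf ",\n\tTITLE = {%s}", title
            if (journal != "") printf ",\n\tJOURNAL = {%s}", journal
            if (year != "")    printf ",\n\tYEAR = {%s}", year
            printf "\n}\n"
        }
        author = title = journal = year = type = ""
        seen = 0
    }'
}

# Invented sample record, followed by one customization rule in the style
# of example.ref2bib (adds the accent to a known name).
printf '%%A K. Sorensen\n%%A B. Writer\n%%T A Sample Title\n%%J Sample Journal\n%%D June 1988\n' |
    refer2bib_sketch |
    sed 's/Sorensen/S{\\o}rensen/g'
```

With the sample record this should give an @ARTICLE entry named bib:001 whose AUTHOR field reads {K. S{\o}rensen and B. Writer}. Keeping the accent rule in a separate "sed" stage mirrors the design described in the message: the conversion heuristics stay generic, and site-specific fixes live in the user's own file.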