Do ‘UniProt’ records comply with the DFSG?
Hello everybody,
I just realised that in a package that I maintain, emboss, the file
‘test/data/uniprotft.sw’ has a non-free license:
Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
Distributed under the Creative Commons Attribution-NoDerivs License
However, the page at the URL above points at Creative Commons' FAQ about
databases, which suggests that the file's contents are actually not
copyrightable.
http://sciencecommons.org/resources/faq/databases/
http://sciencecommons.org/resources/faq/databases/#dbcopyright
Here are extracts of the file:
ID ANXA5_HUMAN Reviewed; 320 AA.
AC P08758; Q6FI16; Q8WV69;
DT 01-NOV-1988, integrated into UniProtKB/Swiss-Prot.
DT 23-JAN-2007, sequence version 2.
DT 10-FEB-2009, entry version 116.
DE RecName: Full=Annexin A5;
DE AltName: Full=Annexin-5;
DE AltName: Full=Annexin V;
DE AltName: Full=Lipocortin V;
DE AltName: Full=Endonexin II;
DE AltName: Full=Calphobindin I;
DE Short=CBP-I;
DE AltName: Full=Placental anticoagulant protein I;
DE Short=PAP-I;
DE AltName: Full=Placental anticoagulant protein 4;
DE Short=PP4;
DE AltName: Full=Thromboplastin inhibitor;
DE AltName: Full=Vascular anticoagulant-alpha;
DE Short=VAC-alpha;
DE AltName: Full=Anchorin CII;
GN Name=ANXA5; Synonyms=ANX5, ENX2, PP4;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini;
OC Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP NUCLEOTIDE SEQUENCE [MRNA].
RX MEDLINE=88163463; PubMed=2964863; DOI=10.1021/bi00399a011;
RA Funakoshi T., Hendrickson L.E., McMullen B.A., Fujikawa K.;
RT "Primary structure of human placental anticoagulant protein.";
RL Biochemistry 26:8087-8092(1987).
…
RP X-RAY CRYSTALLOGRAPHY (2.3 ANGSTROMS).
RX MEDLINE=98118533; PubMed=9435213; DOI=10.1073/pnas.95.2.455;
RA Budisa N., Minks C., Medrano F.J., Lutz J., Huber R., Moroder L.;
RT "Residue-specific bioincorporation of non-natural, biologically active
RT amino acids into proteins as possible drug carriers: structure and
RT stability of the per-thiaproline mutant of annexin V.";
RL Proc. Natl. Acad. Sci. U.S.A. 95:455-459(1998).
CC -!- FUNCTION: This protein is an anticoagulant protein that acts as an
CC indirect inhibitor of the thromboplastin-specific complex, which
CC is involved in the blood coagulation cascade.
CC -!- SUBUNIT: Monomer. Binds ATRX and EIF5B (By similarity).
CC -!- INTERACTION:
CC P70489:Abp10 (xeno); NbExp=1; IntAct=EBI-296601, EBI-78367;
CC P70486:Atrx (xeno); NbExp=1; IntAct=EBI-296601, EBI-78333;
CC Q9Z330:Dnmt1 (xeno); NbExp=1; IntAct=EBI-296601, EBI-78342;
CC P70488:Eif5b (xeno); NbExp=1; IntAct=EBI-296601, EBI-78359;
CC -!- DOMAIN: A pair of annexin repeats may form one binding site for
CC calcium and phospholipid.
CC -!- SIMILARITY: Belongs to the annexin family.
CC -!- SIMILARITY: Contains 4 annexin repeats.
CC -!- CAUTION: This protein has been independently sequenced by at least
CC seven groups under different names.
CC -!- CAUTION: Ref.9 sequence was thought to originate from mouse.
CC -!- WEB RESOURCE: Name=R&D Systems' cytokine source book: Annexin V;
CC URL="http://www.rndsystems.com/molecule_detail.aspx?m=1063";
CC -----------------------------------------------------------------------
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
CC -----------------------------------------------------------------------
DR EMBL; M18366; AAA35570.1; -; mRNA.
DR EMBL; D00172; BAA00122.1; -; mRNA.
DR EMBL; X12454; CAA30985.1; -; mRNA.
DR EMBL; J03745; AAA52386.1; -; mRNA.
DR EMBL; M21731; AAA36166.1; -; mRNA.
DR EMBL; M19384; AAB59545.1; -; mRNA.
DR EMBL; U01691; AAB40047.1; -; Genomic_DNA.
DR EMBL; U01681; AAB40047.1; JOINED; Genomic_DNA.
DR EMBL; U01682; AAB40047.1; JOINED; Genomic_DNA.
…
FT HELIX 221 224
FT TURN 227 229
FT HELIX 232 245
FT HELIX 247 257
FT STRAND 260 263
FT HELIX 266 275
FT TURN 276 280
FT HELIX 281 292
FT HELIX 296 303
FT HELIX 306 316
SQ SEQUENCE 320 AA; 35937 MW; 45E14E3964BA4D1A CRC64;
MAQVLRGTVT DFPGFDERAD AETLRKAMKG LGTDEESILT LLTSRSNAQR QEISAAFKTL
FGRDLLDDLK SELTGKFEKL IVALMKPSRL YDAYELKHAL KGAGTNEKVL TEIIASRTPE
ELRAIKQVYE EEYGSSLEDD VVGDTSGYYQ RMLVVLLQAN RDPDAGIDEA QVEQDAQALF
QAGELKWGTD EEKFITIFGT RSVSHLRKVF DKYMTISGFQ IEETIDRETS GNLEQLLLAV
VKSIRSIPAY LAETLYYAMK GAGTDDHTLI RVMVSRSEID LFNIRKEFRK NFATSLYSMI
KGDTSGDYKK ALLLLCGEDD
//
ID CSF3_HUMAN Reviewed; 207 AA.
AC P09919;
DT 01-JUL-1989, integrated into UniProtKB/Swiss-Prot.
DT 01-JUL-1989, sequence version 1.
DT 10-FEB-2009, entry version 109.
DE RecName: Full=Granulocyte colony-stimulating factor;
DE Short=G-CSF;
DE AltName: Full=Pluripoietin;
DE AltName: INN=Filgrastim;
DE AltName: INN=Lenograstim;
DE Flags: Precursor;
GN Name=CSF3; Synonyms=GCSF;
OS Homo sapiens (Human).
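In case it helps to check which records of such a file carry the statement, here is a minimal sketch in Python. It assumes the flat-file layout shown in the extract above: each record starts with an `ID` line, ends with `//`, and the license statement sits on `CC` lines beginning with "Copyrighted by" or "Distributed under". The function name `license_lines_by_entry` is my own invention for illustration and is not part of emboss or of any UniProt tool.

```python
from typing import Dict, Iterable, List, Optional

# Assumption: the license statement always starts with one of these phrases,
# as it does in the extract above.
LICENSE_MARKERS = ("Copyrighted by", "Distributed under")


def license_lines_by_entry(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each entry name (taken from its ID line) to its license lines."""
    found: Dict[str, List[str]] = {}
    entry: Optional[str] = None
    for raw in lines:
        # The line code is the first whitespace-delimited token (ID, CC, //, ...).
        code, _, rest = raw.rstrip("\n").partition(" ")
        rest = rest.strip()
        if code == "ID" and rest:
            entry = rest.split()[0]
            found[entry] = []
        elif code == "//":
            # End-of-record marker.
            entry = None
        elif code == "CC" and entry is not None:
            if rest.startswith(LICENSE_MARKERS):
                found[entry].append(rest)
    return found
```

It could be run on the file in question with something like `license_lines_by_entry(open("test/data/uniprotft.sw"))`. Records that map to an empty list carry no such statement. This only locates the statement; it says nothing about whether the contents are copyrightable.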
Point (iv) of CC's FAQ notes that:
“The data: – whether the data itself is copyrightable, depends on what it is.
To the extent it consists of factual information, it will not be copyrightable.
For example, the contents of NCBI’s Entrez Gene database include gene names,
descriptions, pathways, protein products, and other facts. However, to the
extent the data is creative and expressive works, such as papers or
photographs, then the database content itself is likely to be protected by
copyright. Even if copyright protection extends to a paper or photograph
contained in a database, that copyright will not extend to the information and
ideas expressed in these materials.”
To me, the contents of the records above look factual. Can I conclude that,
because its contents are not copyrightable, the file is not actually non-free
despite its license statement?
Have a nice day,
--
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan