SGD ORF Additions based on SAGE

The SAGE data provided by Velculescu et al. (1997) Cell 88:243-251 were used at SGD to identify 27 new ORFs using the following strategy:
  1. The coordinates of the tag sequences along the genome were determined, and each tag was classified into one of these four categories: 1) class 1 - within an existing ORF, 2) class 2 - within 500 bp downstream of an existing ORF, 3) class 4 - opposite an existing ORF, or 4) class 3 - none of the above.

  2. The regions between two existing ORFs that contained one or more unique class 3 tags (the "none of the above" category) were examined for potential coding sequences in which the unique tag was located either within the coding sequence or within 500 bp downstream of it. BLASTP analysis against the non-redundant (nr) NCBI dataset was then performed for each potential ORF meeting these criteria, and those with a P value exponent of -6 or less were analyzed further.

  3. The BLAST results were analyzed on an individual basis for each potential ORF meeting the above criteria. Potential ORFs that exhibited reasonable homology to other proteins, and whose matches did not appear to rest on homology to repetitive sequences alone, were identified. They constitute the new ORFs listed in the table below. These ORFs have now been entered into SGD.
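
The tag classification in step 1 can be sketched in a few lines of Python. This is only an illustration, not the procedure SGD actually ran. The Orf record, the classify_tag function, the use of a single coordinate per tag, the reading of "opposite" as "on the other strand within an ORF's span", and the precedence applied when a tag fits more than one category (class 1, then 2, then 4, then 3) are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass

# Window taken from the text: "within 500 bp downstream of an existing ORF".
DOWNSTREAM_WINDOW = 500


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record; start < end regardless of strand."""
    name: str
    start: int   # lowest chromosomal coordinate, inclusive
    end: int     # highest chromosomal coordinate, inclusive
    strand: str  # "W" (Watson) or "C" (Crick)


def classify_tag(pos: int, strand: str, orfs: list[Orf]) -> int:
    """Return the class (1, 2, 3 or 4) of a tag at coordinate `pos`.

    Assumed precedence when several categories apply: 1, then 2, then 4,
    and class 3 ("none of the above") only if nothing else matched.
    """
    downstream = False
    opposite = False
    for orf in orfs:
        inside = orf.start <= pos <= orf.end
        if orf.strand == strand:
            if inside:
                return 1  # class 1: within an existing ORF
            # "Downstream" follows the ORF's direction of transcription:
            # higher coordinates on W, lower coordinates on C.
            if strand == "W" and orf.end < pos <= orf.end + DOWNSTREAM_WINDOW:
                downstream = True
            if strand == "C" and orf.start - DOWNSTREAM_WINDOW <= pos < orf.start:
                downstream = True
        elif inside:
            opposite = True  # tag lies on the strand opposite an ORF
    if downstream:
        return 2
    if opposite:
        return 4
    return 3
```

For example, with one hypothetical Watson-strand ORF spanning 1000-2000, a Watson-strand tag at 2300 falls in class 2, a Crick-strand tag at 1500 falls in class 4, and a Watson-strand tag at 3500 falls in class 3 and would move on to step 2 if it is unique.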

New ORF | Description based on BLAST hits | Chromosomal SAGE Map for ORF | Identifying Tag Information (including expression levels)
YBL091C-A | Similar to D. melanogaster inturned protein | SAGE Map | CATGTTTGATTTGA
YBL107W-A | Similar to TyA and TyB proteins | SAGE Map | CATGGCGTCCTTTG
YCR018C-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGAGAATCACGT
YCR102W-A | Similar to several yeast probable membrane proteins, including YNR075W and YFL062W | SAGE Map | CATGGAGTACCAAC
YDL130W-A | ATPase stabilizing factor, mitochondrial precursor | SAGE Map | CATGCCAAATCAAA
YDR034C-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGCTAATATTAC
YDR034W-B | Similar to probable membrane protein YDR210W and others; similar to YBR016W | SAGE Map | CATGTGTTTATAAG
YDR363W-A/HOD1 | Similar to hypothetical protein from S. pombe | SAGE Map | CATGGGCCAATGGT
YDR525W-A |  | SAGE Map | CATGAGTGACTCTT
YER048W-A | Similar to D. melanogaster protein; weaker homology to C. elegans protein with similarity to human cdk7/cyclin H assembly factor | SAGE Map | CATGGGACTATAAG
YER091C-A |  | SAGE Map | CATGGCAGCTCTTT
YER138W-A | Similar to TyB and TyA | SAGE Map | CATGGATGCCGAAA
YGR122C-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGTGTATATTTT
YIR020W-B |  | SAGE Map | CATGTGGTGGAAAT
YKL033W-A | Similar to S. pombe hypothetical proteins | SAGE Map | CATGCTGTTTTGGG
YKL053C-A | Similar to human sequence predicted by GENSCAN | SAGE Map | CATGTAAATACTAA
YKL162C-A | Similar to PIR1, PIR2 and PIR3 proteins | SAGE Map | CATGTAAATCTGAG
YLL018C-A | Similar to hypothetical S. pombe sequence | SAGE Map | CATGCAAGAGTATC
YLR262C-A | Similar to C. elegans protein | SAGE Map | CATGTCTAGTCGCC
YML081C-A | Similar to S. pombe hypothetical protein | SAGE Map | CATGTTGAAAAGAT
YMR046W-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGTTTTTCTTAA
YMR158C-B | Similar to probable membrane protein YDR340W and to yeast CYC1/CYP3 transcription activator | SAGE Map | CATGGTCCGTATAT
YMR194C-A |  | SAGE Map | CATGCCAGAAGGAG
YNR032C-A | Similar to ubiquitin-like protein 8 of Arabidopsis thaliana and C. elegans | SAGE Map | CATGATCAGACAAA
YOL013W-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGTGTACGCATT
YOR298C-A/MBF1 | Similar to multiprotein bridging factor 1 of Bombyx mori | SAGE Map | CATGTCATCAATGA
YPR002C-A | Similar to probable membrane protein YLR334C and ORF YOL106W | SAGE Map | CATGAATTGACGAA