Package edu.msu.cme.rdp.classifier
Class TrainingInfo
java.lang.Object
edu.msu.cme.rdp.classifier.TrainingInfo
The TrainingInfo holds all the training information and taxonomy hierarchy information.
Constructor Summary
Constructors
Constructor | Description
TrainingInfo() | Creates new TrainingInfo.
Method Summary
Modifier and Type | Method | Description
 | createClassifier | Creates a new Classifier if all the training information is complete; throws an exception if not.
void | createGenusWordProbList(Reader reader) | Reads in the index of the genus tree node and the conditional probability that the genus contains a word.
void | createLogWordPriorArr(Reader reader) | Reads in the log value of the word prior probability and saves it to the array LogWordPriorArr.
void | createProbIndexArr(Reader reader) | Reads in the start index of the conditional probability of each genus and saves it to the array wordConditionalProbIndexArr.
void | createTree(Reader reader) | Reads in the tree information from a reader and creates all the HierarchyTrees.
void | generateWordPairDiffArr(int[] word, int beginIndex) | For a given word w1 and its reverse complement word w2, calculates the difference between the log word priors of w1 and w2 and saves it to an array.
 | getGenusNodebyIndex(int i) | Returns a genus node from the genusNodeList at the specified position.
int | getGenusNodeListSize() | Returns the number of genus nodes.
 | getHierarchyInfo | Returns the info of the taxonomy hierarchy from the training file.
 | getHierarchyVersion | Returns the version of the taxonomical hierarchy.
float | getLogLeaveCount(int i) | Returns the log value of (number of leaves + 1) of a genus.
float | getLogWordPrior(int wordIndex) | Returns the log value of the prior probability of a word.
 | getRootTree | Returns the root of the trees.
int | getStartIndex(int wordIndex) | Returns the start index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
int | getStopIndex(int wordIndex) | Returns the stop index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
 | getTrainRank | Returns the rank the classifier was trained on.
 | getWordConditionalProbObject(int posIndex) | Returns a GenusIndexWordConditionalProb from the genusIndex_wordConditionalProbList at the specified position in the list.
float | getWordPairPriorDiff(int wordIndex) | Returns the difference between the log word prior of the given word and that of its reverse complement word.
boolean | isSeqReversed(int[] wordIndexArr, int wordCount) |
boolean | isSeqReversed | Returns true if the sequence is in reverse orientation.
Constructor Details
TrainingInfo
public TrainingInfo()
Creates new TrainingInfo.
Method Details
createTree
Reads in the tree information from a reader and creates all the HierarchyTrees. Note: the tree information must be read after at least one of the other three files, because the version information has to be set first.
Throws:
IOException
TrainingDataException
createLogWordPriorArr
Reads in the log value of the word prior probability and saves it to the array LogWordPriorArr.
Throws:
IOException
TrainingDataException
generateWordPairDiffArr
public void generateWordPairDiffArr(int[] word, int beginIndex)
For a given word w1 and its reverse complement word w2, calculates the difference between the log word priors of w1 and w2 and saves it to an array. Repeats for every possible word of size 8.
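The pairing of each word with its reverse complement can be illustrated with a short, self-contained sketch. It is not the library's implementation: the class name WordPairDiffSketch, the 2-bit base encoding (A=0, C=1, G=2, T=3, first base in the highest bits), the direction of the subtraction, and the plain loop over all word indices (used here in place of the int[] word and beginIndex parameters) are all assumptions made for illustration.

```java
/** Illustrative sketch only; not the RDP Classifier source. */
final class WordPairDiffSketch {
    static final int WORD_SIZE = 8;                     // word size stated in the documentation
    static final int NUM_WORDS = 1 << (2 * WORD_SIZE);  // 4^8 = 65536 possible words

    /** Assumed encoding: 2 bits per base, A=0, C=1, G=2, T=3, first base in the highest bits. */
    static int reverseComplement(int wordIndex) {
        int rc = 0;
        for (int i = 0; i < WORD_SIZE; i++) {
            int base = wordIndex & 3;     // take the last remaining base
            rc = (rc << 2) | (3 - base);  // complement it (A<->T, C<->G) and append it
            wordIndex >>>= 2;
        }
        return rc;
    }

    /** diff[w] = logWordPrior[w] - logWordPrior[reverseComplement(w)] for every word w. */
    static float[] wordPairDiff(float[] logWordPrior) {
        float[] diff = new float[NUM_WORDS];
        for (int w = 0; w < NUM_WORDS; w++) {
            diff[w] = logWordPrior[w] - logWordPrior[reverseComplement(w)];
        }
        return diff;
    }

    public static void main(String[] args) {
        float[] logPrior = new float[NUM_WORDS];
        for (int w = 0; w < NUM_WORDS; w++) {
            logPrior[w] = (float) Math.log((w % 7 + 1) / 8.0); // made-up priors, for demonstration only
        }
        float[] diff = wordPairDiff(logPrior);
        int w = 12345;
        // The two values printed are negatives of each other.
        System.out.println(diff[w] + " " + diff[reverseComplement(w)]);
    }
}
```

Under these assumptions, the difference for a word is the negative of the difference for its reverse complement, and a word that is its own reverse complement has a difference of zero.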
createGenusWordProbList
Reads in the index of the genus tree node and the conditional probability that the genus contains a word. Saves the data into the list genus_wordConditionalProbList.
Throws:
IOException
TrainingDataException
createProbIndexArr
Reads in the start index of the conditional probability of each genus and saves it to the array wordConditionalProbIndexArr.
Throws:
IOException
TrainingDataException
createClassifier
Creates a new Classifier if all the training information is complete; throws an exception if not.
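The completeness check, together with the ordering note under createTree, can be pictured with a small state-tracking sketch. Everything in it is hypothetical: the class TrainingStateSketch, its method names, the use of a version string passed in by the caller, and the choice of IllegalStateException are stand-ins, not the library's actual fields or exception types.

```java
/** Illustrative sketch only; not the RDP Classifier source. */
final class TrainingStateSketch {
    private String version;          // assumed: set by the first non-tree file that is read
    private boolean priorsLoaded;
    private boolean probIndexLoaded;
    private boolean genusProbsLoaded;
    private boolean treeLoaded;

    void logWordPriorsRead(String fileVersion) { rememberVersion(fileVersion); priorsLoaded = true; }

    void probIndexRead(String fileVersion) { rememberVersion(fileVersion); probIndexLoaded = true; }

    void genusWordProbsRead(String fileVersion) { rememberVersion(fileVersion); genusProbsLoaded = true; }

    /** The tree may only be read once another file has set the version. */
    void treeRead() {
        if (version == null) {
            throw new IllegalStateException("read at least one of the other three files before the tree");
        }
        treeLoaded = true;
    }

    private void rememberVersion(String fileVersion) {
        if (version == null) {
            version = fileVersion;
        }
    }

    boolean isComplete() {
        return priorsLoaded && probIndexLoaded && genusProbsLoaded && treeLoaded;
    }

    /** Mirrors the documented behaviour: refuse to continue unless all four parts are present. */
    void requireComplete() {
        if (!isComplete()) {
            throw new IllegalStateException("training information is incomplete");
        }
    }

    public static void main(String[] args) {
        TrainingStateSketch state = new TrainingStateSketch();
        state.logWordPriorsRead("v1");
        state.probIndexRead("v1");
        state.genusWordProbsRead("v1");
        state.treeRead();
        state.requireComplete();
        System.out.println("complete: " + state.isComplete());
    }
}
```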
getRootTree
Returns the root of the trees.
getTrainRank
Returns:
the rank the classifier was trained on
getGenusNodeListSize
public int getGenusNodeListSize()
Returns the number of genus nodes.
getGenusNodebyIndex
Returns a genus node from the genusNodeList at the specified position.
getLogWordPrior
public float getLogWordPrior(int wordIndex)
Returns the log value of the prior probability of a word.
getWordPairPriorDiff
public float getWordPairPriorDiff(int wordIndex)
Returns the difference between the log word prior of the given word and that of its reverse complement word.
getLogLeaveCount
public float getLogLeaveCount(int i)
Returns the log value of (number of leaves + 1) of a genus.
getStartIndex
public int getStartIndex(int wordIndex)
Returns the start index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
getStopIndex
public int getStopIndex(int wordIndex)
Returns the stop index of GenusIndexWordConditionalProb in the array for the specified wordIndex.
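One way to picture the start and stop indices is as a slice into a flat list of per-genus entries, grouped by word. The sketch below assumes that layout: the index array holds the first entry for each word plus one trailing end marker, and the stop index is exclusive. The class WordIndexSketch, the GenusProb stand-in for GenusIndexWordConditionalProb, and the probFor helper are hypothetical and are not taken from the library.

```java
/** Illustrative sketch only; not the RDP Classifier source. */
final class WordIndexSketch {
    /** Hypothetical stand-in for GenusIndexWordConditionalProb. */
    static final class GenusProb {
        final int genusIndex;
        final float prob;

        GenusProb(int genusIndex, float prob) {
            this.genusIndex = genusIndex;
            this.prob = prob;
        }
    }

    // Assumed layout: probIndexArr[w] is the position of the first entry for word w,
    // and one extra trailing element marks the end of the list.
    private final int[] probIndexArr;
    private final GenusProb[] probList; // entries for all words, stored word by word

    WordIndexSketch(int[] probIndexArr, GenusProb[] probList) {
        this.probIndexArr = probIndexArr;
        this.probList = probList;
    }

    int getStartIndex(int wordIndex) {
        return probIndexArr[wordIndex];
    }

    /** Exclusive stop index, assumed to equal the start index of the next word. */
    int getStopIndex(int wordIndex) {
        return probIndexArr[wordIndex + 1];
    }

    /** Returns the stored probability of the word in the genus, or fallback if there is no entry. */
    float probFor(int wordIndex, int genusIndex, float fallback) {
        for (int i = getStartIndex(wordIndex); i < getStopIndex(wordIndex); i++) {
            if (probList[i].genusIndex == genusIndex) {
                return probList[i].prob;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        GenusProb[] list = { new GenusProb(0, 0.5f), new GenusProb(4, 0.25f), new GenusProb(1, 0.75f) };
        // Word 0 owns entries 0 and 1, word 1 owns none, word 2 owns entry 2.
        WordIndexSketch index = new WordIndexSketch(new int[] {0, 2, 2, 3}, list);
        System.out.println(index.probFor(0, 4, -1f));
    }
}
```

With this layout a word that occurs in no genus simply has equal start and stop indices, so no per-word list objects are needed.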
getWordConditionalProbObject
Returns a GenusIndexWordConditionalProb from the genusIndex_wordConditionalProbList at the specified position in the list.
getHierarchyVersion
Returns the version of the taxonomical hierarchy.
getHierarchyInfo
Returns the info of the taxonomy hierarchy from the training file.
isSeqReversed
Returns true if the sequence is in reverse orientation. For every overlapping word in the query sequence, takes the difference between the word and its reverse complement, and sums these differences. If the sum is less than zero, the query sequence is in reverse orientation.
Throws:
IOException
isSeqReversed
public boolean isSeqReversed(int[] wordIndexArr, int wordCount)
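The summation described for isSeqReversed can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's code: the class OrientationSketch is hypothetical, and the difference array is passed in as a third parameter named wordPairPriorDiff, whereas the documented method takes only wordIndexArr and wordCount.

```java
/** Illustrative sketch only; not the RDP Classifier source. */
final class OrientationSketch {
    /**
     * Sums the per-word prior differences for the first wordCount words of the query
     * and reports reverse orientation when the sum is negative.
     */
    static boolean isSeqReversed(int[] wordIndexArr, int wordCount, float[] wordPairPriorDiff) {
        float sum = 0f;
        for (int i = 0; i < wordCount; i++) {
            sum += wordPairPriorDiff[wordIndexArr[i]];
        }
        return sum < 0f;
    }

    public static void main(String[] args) {
        // Made-up differences for a toy vocabulary of four words.
        float[] diff = {0.5f, -0.5f, 1.0f, -2.0f};
        System.out.println(isSeqReversed(new int[] {0, 2}, 2, diff)); // sum is positive
        System.out.println(isSeqReversed(new int[] {1, 3}, 2, diff)); // sum is negative
    }
}
```

Because only the first wordCount entries are read, the index array can be a reusable buffer that is larger than the number of words in the current query.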