Which of the following is a part of the statistical test of sequences?

An optimal alignment between two chosen sequences is obtained at the end

Unrelated sequences of the same length are then generated through a randomization process

Unrelated sequences of the different length are then generated through a randomization process

Related sequences of the same length are then generated through a randomization process

Many studies have demonstrated that the distribution of similarity scores assumes a peculiar shape that resembles a highly skewed normal distribution with a long tail on one side. The distribution matches the _______

Gumbel elective value distribution

Gumbel extreme void distribution

Gumbel extreme value distribution

151 + Mcqs in Statistical Significance Sequence Alignment in Bioinformatics Page 3 McqOptions

101.	Which of the following is not a characteristic of fuzzy or approximate matches in regular expressions?
A.	This method is able to include more variant forms of a motif with a conserved function
B.	The rule of matching is based on observations, not actual assumptions
C.	With the more relaxed matching, there is an increase in the noise level and false positives
D.	The rule of matching is based on assumptions, not actual observations
Answer» B. The rule of matching is based on observations, not actual assumptions

102.	What does the representation R.L.[EQD] mean?
A.	An arginine, any amino acid, a leucine, any amino acid, then either aspartic acid, glutamic acid or glutamine
B.	An arginine, a leucine, then either aspartic acid, glutamic acid or glutamine
C.	An arginine, a leucine, any amino acid, then either aspartic acid, glutamic acid or glutamine
D.	An arginine, a leucine, then aspartic acid and glutamic acid and glutamine
Answer» A. An arginine, any amino acid, a leucine, any amino acid, then either aspartic acid, glutamic acid or glutamine
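The pattern in question 102 can be tried directly with Python's standard `re` module, where `.` matches any single character and `[EQD]` matches exactly one of E, Q or D. This is only a minimal sketch: the function name `find_motif` and the test sequence are made up for illustration.

```python
import re

# R.L.[EQD]: R, any residue, L, any residue, then one of E, Q or D.
MOTIF = re.compile(r"R.L.[EQD]")


def find_motif(sequence):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(sequence)]


# Hypothetical sequence: "RALGE" fits the pattern, while the trailing
# "RLQ" does not, because the positions after R and after L must be
# filled by some residue.
print(find_motif("MKRALGEAAARLQ"))
```

Each `.` consumes exactly one residue, so the motif is always five residues long.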

103.	Emotif uses which databases for alignment of sequences?
A.	BLOCKS and PRINTS databases
B.	PROSITE
C.	BLOCKS
D.	PRINTS
Answer» A. BLOCKS and PRINTS databases

104.	While analysing motif sequences, what is the major disadvantageous feature of PROSITE?
A.	The database constructs profiles to complement some of the sequence patterns
B.	The functional information of these patterns is primarily based on published literature
C.	Some of the sequence patterns are too short to be specific
D.	Lack of specificity about probability and variation and relation between them
Answer» C. Some of the sequence patterns are too short to be specific

105.	Which of the following is wrong in case of substitution matrices?
A.	They determine likelihood of homology between two sequences
B.	They use system where substitutions that are more likely should get a higher score
C.	They use system where substitutions that are less likely should get a lower score
D.	BLOSUM-X type uses logarithmic identity to find similarity
Answer» D. BLOSUM-X type uses logarithmic identity to find similarity

106.	Which of the following does not describe PAM matrices?
A.	These matrices are used in optimal alignment scoring
B.	It stands for Point Altered Mutations
C.	It stands for Point Accepted Mutations
D.	It was first developed by Margaret Dayhoff
Answer» B. It stands for Point Altered Mutations

107.	Which of the following is untrue regarding the scoring system used in dynamic programming?
A.	If the residues are same in both the sequences the match score is assumed as +5 which is added to the diagonally positioned cell of the current cell
B.	If the residues are not same, the mismatch score is assumed as -3
C.	If the residues are not same, the mismatch score is assumed as 3
D.	The score should be added to the diagonally positioned cell of the current cell
Answer» C. If the residues are not same, the mismatch score is assumed as 3

108.	Which of the following best defines regular expressions?
A.	They are made up of terms, operators and modifiers
B.	They describe string or set of strings to find matching patterns
C.	They are strictly restricted to alignment and corresponding score
D.	They consist of set of rules for the connotations of various amino acid residues
Answer» B. They describe string or set of strings to find matching patterns

109.	While scanning for similarities in motifs, how do regular expression techniques work?
A.	It represents a sequence family by a string of characters and further compares them
B.	An algorithm similar to dynamic programming is used
C.	Dot matrix analysis is used in this type of sequence analysis
D.	Matrix analysis methods are used in this type
Answer» A. It represents a sequence family by a string of characters and further compares them

110.	In terminologies related to regular expressions which of the following is false about terms and operators?
A.	Terms are strings or substrings
B.	Operators combine terms and expressions
C.	Operators do not have precedence
D.	Operators have precedence like arithmetic operators
Answer» C. Operators do not have precedence

111.	In regular expressions, which of the following pair of pattern is wrongly matched with its significance?
A.	‘-’ – separator
B.	< – N-terminal
C.	> – C-terminal
D.	‘>>’ – end
Answer» D. ‘>>’ – end
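The symbols in question 111 come from PROSITE-style pattern syntax, which can be translated mechanically into an ordinary regular expression. The sketch below is an assumption-laden illustration rather than an official converter. The function name `prosite_to_regex` is hypothetical, and it handles only the common elements: `-` separators, `x` wildcards, `[..]` alternatives, `{..}` exclusions, `(n)` or `(n,m)` repeats, and the `<` and `>` terminus anchors.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    pattern = pattern.rstrip(".")                        # patterns end with a period
    regex = pattern.replace("-", "")                     # '-' only separates positions
    regex = regex.replace("x", ".")                      # 'x' means any residue
    regex = regex.replace("{", "[^").replace("}", "]")   # {..} lists excluded residues
    regex = regex.replace("(", "{").replace(")", "}")    # (n) or (n,m) are repeat counts
    regex = regex.replace("<", "^").replace(">", "$")    # N-terminal and C-terminal anchors
    return regex


# Illustrative pattern, not taken from any specific PROSITE entry.
zinc_like = prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.")
print(zinc_like)
print(bool(re.search(zinc_like, "AACAACAAALAAAAAAAAHAAAHAA")))
```

The exclusion braces are rewritten before the repeat parentheses so that the two uses of `{` and `}` never collide.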

112.	Point out the wrong or irrelevant mathematical method in motif analysis.
A.	Enumeration
B.	Probabilistic Optimization
C.	Deterministic Optimization
D.	Literature mining
Answer» D. Literature mining

113.	Which of the following is false in case of the database InterPro and its algorithm?
A.	InterPro is an integrated pattern database designed to unify multiple databases for protein domains and functional sites
B.	This database integrates information from PROSITE, Pfam, PRINTS, ProDom, and SMART databases
C.	Only overlapping motifs and domains in a protein sequence derived by all five databases are included
D.	All the motifs and domains in a protein sequence derived by all five databases are included
Answer» D. All the motifs and domains in a protein sequence derived by all five databases are included

114.	Which of the following is false in case of the CDART and its algorithm?
A.	CDART is a domain search program that combines the results from RPS-BLAST, SMART, and Pfam
B.	The program is now an integral part of the regular BLAST search function
C.	CDART is a substitute for individual database searches
D.	It stands for Conserved Domain Architecture
Answer» C. CDART is a substitute for individual database searches

115.	Which of the following is false in case of the database Pfam and its algorithm?
A.	Each motif or domain is represented by an HMM profile generated from the seed alignment of a number of conserved homologous proteins
B.	Since the probability scoring mechanism is more complex in HMM than in a profile-based approach the use of HMM yields further increases in sensitivity of the database matches
C.	Pfam-B only contains sequence families not covered in Pfam
D.	The functional annotation of motifs in Pfam-A is often related to that in UNIPROT
Answer» D. The functional annotation of motifs in Pfam-A is often related to that in UNIPROT

116.	Which of the following is false in case of the database SMART and its algorithm?
A.	Contains HMM profiles constructed from manually refined protein domain alignments
B.	Alignments in the database are built based on tertiary structures whenever available or based on PSI-BLAST profiles
C.	Alignments are further checked but not refined by human annotators before HMM profile construction
D.	SMART stands for Simple Modular Architecture Research Tool
Answer» C. Alignments are further checked but not refined by human annotators before HMM profile construction

117.	Which of the following statements about CATH-Gene3D and HAMAP databases is incorrect regarding its features?
A.	CATH-Gene3D describes protein families and domain architectures in complete genomes
B.	In CATH-Gene3D the functional annotation is provided to proteins from single resource
C.	HAMAP profiles are manually created by expert curators; they identify proteins that are part of well-conserved bacterial, archaeal and plastid-encoded protein families or subfamilies
D.	HAMAP stands for High-quality Automated and Manual Annotation of microbial Proteomes
Answer» B. In CATH-Gene3D the functional annotation is provided to proteins from single resource

118.	Which of the following statements about PRINTS and ProDom databases is incorrect regarding its features?
A.	PRINTS is a compendium of protein fingerprints
B.	Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D-space
C.	Current versions of ProDom are built using a novel procedure based on recursive BLAST searches
D.	ProDom domain database consists of an automatic compilation of homologous domains
Answer» C. Current versions of ProDom are built using a novel procedure based on recursive BLAST searches

119.	Which of the following statements about PANTHER and TIGRFAMs databases is incorrect regarding its features?
A.	TIGRFAMs provides a tool for identifying functionally related proteins based on sequence homology
B.	TIGRFAMs is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation
C.	Hidden Markov models (HMMs) are not used in PANTHER
D.	PANTHER is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise
Answer» C. Hidden Markov models (HMMs) are not used in PANTHER

120.	What is the source of protein structures in SCOP and CATH?
A.	Uniprot
B.	Protein Data Bank
C.	Ensemble
D.	InterPro
Answer» B. Protein Data Bank

121.	Which of the following statements about SUPERFAMILY database is incorrect regarding its features?
A.	Sequences can be submitted raw or FASTA format
B.	Sequences must be submitted in FASTA format only
C.	It searches the database using a superfamily, family, or species name plus a sequence, SCOP, PDB or HMM ID’s
D.	It has generated GO annotations for evolutionarily closed domains and distant domains
Answer» B. Sequences must be submitted in FASTA format only

122.	Which of the following is not an advantage of Statistical models’ methods in analyzing protein motifs?
A.	Sequence information is preserved from a multiple sequence alignment and expresses it with probabilistic models
B.	Statistical models allow partial matches and compensate for unobserved sequence patterns using pseudo-counts
C.	Statistical models have stronger predictive power than the regular expression based approach, even when they are derived from a limited set of sequences
D.	The comparative flexibility is less in case of these methods when compared to regular expressions methods
Answer» D. The comparative flexibility is less in case of these methods when compared to regular expressions methods

123.	Which of the following is not an advantageous feature or algorithm of the database PRINTS?
A.	This program breaks down a motif into even smaller non-overlapping units called ‘fingerprints’, which are represented by unweighted PSSMs
B.	To define a motif, at least a majority of fingerprints are required to match with a query sequence
C.	A query that has simultaneous high-scoring matches to a majority of fingerprints belonging to a motif is a good indication of containing the functional motif
D.	The difficulty to recognize short motifs when they reach the size of single fingerprints
Answer» D. The difficulty to recognize short motifs when they reach the size of single fingerprints

124.	For motif scanning which of the following programs or databases is for regulated sites curated from scientific literature?
A.	ENSEMBL
B.	ORegAnno
C.	MAST
D.	Clover
Answer» B. ORegAnno

125.	Which of the following is untrue in case of the database BLOCKS?
A.	The alignments are automatically generated using the same data sets used for deriving the BLOSUM matrices
B.	The derived ungapped alignments are called ‘blocks’, which are usually longer than motifs, are subsequently converted to PSSMs
C.	A weighting scheme and pseudo counts are subsequently applied to the PSSMs to account for underrepresented and unobserved residues in alignments
D.	The functional annotation of blocks is not consistent with that for the motifs
Answer» D. The functional annotation of blocks is not consistent with that for the motifs

126.	Which of the following statements about InterPro is incorrect regarding its features?
A.	Protein relatedness is defined by the P-values from the BLAST alignments
B.	The most closely related sequences are grouped into the lowest level clusters
C.	More distant protein groups are merged into higher levels of clusters
D.	The outcome of this cluster merging is a tree-like structure of functional categories
Answer» A. Protein relatedness is defined by the P-values from the BLAST alignments

127.	In which of the following multipurpose packages Gibbs sampling algorithm is used?
A.	Consensus
B.	BEST
C.	AlignACE
D.	PhyloCon
Answer» C. AlignACE

128.	Which of the following statements about COG is incorrect regarding its features?
A.	Currently, there are 4,873 clusters in the COG databases derived from unicellular organisms
B.	It is constructed by comparing protein sequences encoded in forty-three completely sequenced genomes, which are mainly from prokaryotes, representing thirty major phylogenetic lineages
C.	The interface for sequence searching in the COG database is the COGnitor program, which is based on gapped BLAST
D.	It is a protein family database based on structural classification
Answer» D. It is a protein family database based on structural classification

129.	Which of the following is not a member database of InterPro?
A.	SCOP
B.	HAMAP
C.	PANTHER
D.	Pfam
Answer» A. SCOP

130.	Pfam is available at four locations around the world. Which of the following is not one of them?
A.	UK
B.	Sweden
C.	US
D.	Japan
Answer» D. Japan

131.	Which of the following statements about SCOP is incorrect regarding its features?
A.	Proteins with the same shapes but having little sequence or functional similarity are placed in different super families, and are assumed to have only a very distant common ancestor
B.	Proteins having the same shape and some similarity of sequence and/or function are placed in ‘families’, and are assumed to have a closer common ancestor
C.	SCOP was created in 1994 in the Centre of Protein Engineering and the University College London
D.	It aims to determine the evolutionary relationship between proteins
Answer» C. SCOP was created in 1994 in the Centre of Protein Engineering and the University College London

132.	When did Needleman and Wunsch first describe the algorithm for global alignment?
A.	1899
B.	1970
C.	1930
D.	1950
Answer» B. 1970

133.	Which of the following is not a disadvantage of Needleman-Wunsch algorithm?
A.	This method is comparatively slow
B.	There is a need of intensive memory
C.	This cannot be applied on genome sized sequences
D.	This method can be applied to even large sized sequences
Answer» D. This method can be applied to even large sized sequences

134.	Which of the following is not an advantage of Needleman-Wunsch algorithm?
A.	New algorithmic improvements as well as increasing computer capacity make it possible to align a query sequence against a large DB in a few minutes
B.	Similar sequence region is of same order and orientation
C.	This does not help in determining evolutionary relationship
D.	If you have 2 genes that are already understood as closely related, then this type of algorithm can be used to understand them in further details
Answer» C. This does not help in determining evolutionary relationship

135.	Which of the following does not describe dynamic programming?
A.	The approach compares every pair of characters in the two sequences and generates an alignment, which is the best or optimal
B.	Global alignment algorithm is based on this method
C.	Local alignment algorithm is based on this method
D.	The method can be useful in aligning protein sequences to protein sequences only
Answer» D. The method can be useful in aligning protein sequences to protein sequences only

136.	Which of the following does not describe global alignment algorithm?
A.	In initialization step, the first row and first column are subject to gap penalty
B.	Score can be negative
C.	In trace back step, beginning is with the cell at the lower right of the matrix and it ends at top left cell
D.	First row and first column are set to zero
Answer» D. First row and first column are set to zero
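The global alignment steps named in questions 107 and 136 (initializing the first row and column with gap penalties, filling the matrix from the diagonal, upper and left cells, then tracing back from the lower-right cell to the top-left cell) can be sketched as follows. The match score of +5 and mismatch score of -3 are the values used in question 107. The linear gap penalty of -4 and the function name `needleman_wunsch` are assumptions made only for this example.

```python
def needleman_wunsch(a, b, match=5, mismatch=-3, gap=-4):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Initialization: first row and first column accumulate gap penalties.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Matrix fill: each cell takes the best of diagonal, up and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Traceback: start at the lower-right cell, stop at the top-left cell.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


print(needleman_wunsch("ACGT", "AGT"))
print(needleman_wunsch("A", "G"))  # a global score can be negative
```

Setting the first row and column to zero instead, and clamping every cell at zero, would turn this into the local (Smith-Waterman) variant, which is why that statement does not describe global alignment.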

137.	What is the length of a motif, in terms of amino acids residue?
A.	30- 60
B.	10- 20
C.	70- 90
D.	1- 10
Answer» B. 10- 20

138.	Which of the following is false about the ‘loop’ structure in proteins?
A.	They connect helices and sheets
B.	They are more tolerant of mutations
C.	They are more flexible and can adopt multiple conformations
D.	They are never the components of active sites
Answer» D. They are never the components of active sites

139.	On average, what is the length of a typical domain?
A.	About 100 residues
B.	About 300 residues
C.	About 500 residues
D.	About 900 residues
Answer» A. About 100 residues

140.	Which of the following least describes Long Loop β-hairpins?
A.	They are Often referred to as a ‘random coil’ conformation
B.	Generally they are referred to as the β-meander supersecondary structure
C.	Loop looks similar to the Greek Letter Ω
D.	Wide-range of conformations with very specific sequence preferences
Answer» D. Wide-range of conformations with very specific sequence preferences

141.	Which of the common structural motifs are described wrongly?
A.	β-hairpin – adjacent antiparallel strands
B.	Greek key – 4 adjacent antiparallel strands
C.	β-α-β – 2 parallel strands connected by helix
D.	β-α-β – 2 antiparallel strands connected by helix
Answer» D. β-α-β – 2 antiparallel strands connected by helix

142.	Which of the following is untrue about the PRSS program?
A.	It stands for Probability of Random Shuffles
B.	It is a web-based program that can be used to evaluate the statistical significance of DNA or protein sequence alignment
C.	It first aligns two sequences using the Needleman-Wunsch algorithm and calculates the score
D.	It holds one sequence in its original form and randomizes the order of residues in the other sequence.
Answer» C. It first aligns two sequences using the Needleman-Wunsch algorithm and calculates the score

143.	The major disadvantage of the PRSS program is that it doesn’t allow partial shuffling.
A.	True
B.	False
Answer» B. False

144.	It is not known whether the Gumbel distribution applies equally well to gapped alignments.
A.	True
B.	False
Answer» A. True

145.	If the score is located in the extreme margin of the distribution, that means that the alignment between the two sequences is ______ due to random chance and is thus considered ______
A.	unlikely, significant
B.	unlikely, insignificant
C.	very likely, insignificant
D.	very likely, significant
Answer» A. unlikely, significant

146.	What is used to generate parameters for the extreme distribution?
A.	The pool of alignment scores from the shuffled sequences
B.	A single score of a shuffled sequence
C.	The pool of alignment scores from the unshuffled sequences
D.	The basic optimal score computed at the beginning of the test
Answer» A. The pool of alignment scores from the shuffled sequences

147.	In the statistical test, the randomization process randomly shuffles one of the two given sequences.
A.	True
B.	False
Answer» A. True

148.	Which of the following is a part of the statistical test of sequences?
A.	An optimal alignment between two chosen sequences is obtained at the end
B.	Unrelated sequences of the same length are then generated through a randomization process
C.	Unrelated sequences of different lengths are then generated through a randomization process
D.	Related sequences of the same length are then generated through a randomization process
Answer» B. Unrelated sequences of the same length are then generated through a randomization process
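
The shuffling test described in the questions above can be sketched roughly as follows. This is an illustrative outline only, not the PRSS implementation: the helper names (local_score, shuffled_scores), the scoring values, the number of shuffles and the two example sequences are all assumptions. One sequence is kept as is, the other is shuffled repeatedly (so every randomized sequence has the same length and composition), and the observed score is compared with the pool of shuffled scores.

```python
import random


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (score only, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment never lets a cell drop below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def shuffled_scores(a, b, n_shuffles=200, seed=0):
    """Score `a` against repeatedly shuffled copies of `b`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    residues = list(b)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)  # same length and composition, random order
        scores.append(local_score(a, "".join(residues)))
    return scores


# Hypothetical example sequences.
seq_a = "HEAGAWGHEE"
seq_b = "PAWHEAE"

observed = local_score(seq_a, seq_b)   # score of the real pair, computed first
pool = shuffled_scores(seq_a, seq_b)   # background scores from unrelated pairs
fraction = sum(s >= observed for s in pool) / len(pool)
print(observed, fraction)
```

The empirical fraction is only a crude estimate; fitting an extreme value distribution to the pool gives a smoother tail probability.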

149.	Many studies have demonstrated that the distribution of similarity scores assumes a peculiar shape that resembles a highly skewed normal distribution with a long tail on one side. The distribution matches the _______
A.	Gumbel elective value distribution
B.	Gumbel extreme void distribution
C.	Gumbel end value distribution
D.	Gumbel extreme value distribution
Answer» D. Gumbel extreme value distribution
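
As a rough illustration of how a pool of shuffled-sequence scores might be turned into a significance estimate, the sketch below fits a Gumbel extreme value distribution by the method of moments and evaluates its upper tail. The method-of-moments fit is an assumption made here for simplicity (actual programs may estimate the parameters differently), and the function names and the score list are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(scores):
    """Estimate Gumbel location and scale by the method of moments.

    For a Gumbel distribution: mean = location + gamma * scale and
    variance = (pi ** 2 / 6) * scale ** 2.
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    scale = sd * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def tail_probability(score, location, scale):
    """P(S >= score) under the fitted Gumbel distribution."""
    return 1.0 - math.exp(-math.exp(-(score - location) / scale))


# Hypothetical pool of alignment scores from shuffled sequences.
pool = [8, 9, 7, 10, 8, 11, 9, 8, 12, 9]
location, scale = fit_gumbel(pool)

# A score far out in the long tail gets a small probability of arising
# by chance, which is why such an alignment is considered significant.
print(tail_probability(20, location, scale))
```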

150.	By calculating alignment scores of a large number of ______ sequence pairs, a distribution model of the ______ sequence scores can be derived.
A.	related, randomized
B.	unrelated, randomized
C.	unrelated, unrandomized
D.	related, unrandomized
Answer» B. unrelated, randomized
