Explore topic-wise MCQs in Bioinformatics.

This section includes 151 MCQs, each offering curated multiple-choice questions to sharpen your Bioinformatics knowledge and support exam preparation. Choose a topic below to get started.

101.

Which of the following is not a characteristic of fuzzy (approximate) matching in regular expressions?

A. This method is able to include more variant forms of a motif with a conserved function
B. the rule of matching is based on observations, not actual assumptions
C. with the more relaxed matching, there is increase of the noise level and false positives
D. the rule of matching is based on assumptions not actual observations
Answer» B. the rule of matching is based on observations, not actual assumptions
102.

What does this representation mean- R.L.[EQD]?

A. Arginine - any amino acid - leucine - any amino acid - either aspartic acid, glutamic acid or glutamine
B. Arginine - leucine - either aspartic acid, glutamic acid or glutamine
C. Arginine - leucine - any amino acid - either aspartic acid, glutamic acid or glutamine
D. Arginine - leucine - aspartic acid and glutamic acid and glutamine
Answer» A. Arginine - any amino acid - leucine - any amino acid - either aspartic acid, glutamic acid or glutamine
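
The pattern in this question can be tried out directly. The following is a minimal sketch using Python's standard `re` module, where `.` stands for any single residue and `[EQD]` for one of E, Q or D. The function name `find_motif` and the test sequence are made up for illustration.

```python
import re

# The pattern from the question, read as a regular expression:
# R, any residue, L, any residue, then one of E, Q or D.
MOTIF = re.compile(r"R.L.[EQD]")

def find_motif(sequence):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(sequence)]

# Hypothetical sequence: "RALGE" fits the pattern, while "RLDQ" does not,
# because each dot must consume exactly one residue.
hits = find_motif("MKRALGEAARLDQ")
print(hits)
```

Note that a stretch such as R-L-D does not match, since the two dots are required positions and not optional spacers.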
103.

Which databases does EMOTIF use for its sequence alignments?

A. BLOCKS and PRINTS databases
B. PROSITE
C. BLOCKS
D. PRINTS
Answer» A. BLOCKS and PRINTS databases
104.

When analysing motif sequences, what is the major disadvantage of PROSITE?

A. The database constructs profiles to complement some of the sequence patterns
B. The functional information of these patterns is primarily based on published literature
C. Some of the sequence patterns are too short to be specific
D. Lack of specificity about probability and variation and relation between them
Answer» C. Some of the sequence patterns are too short to be specific
105.

Which of the following is wrong regarding substitution matrices?

A. They determine likelihood of homology between two sequences
B. They use system where substitutions that are more likely should get a higher score
C. They use system where substitutions that are less likely should get a lower score
D. BLOSUM-X type uses logarithmic identity to find similarity
Answer» D. BLOSUM-X type uses logarithmic identity to find similarity
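
The idea that likelier substitutions score higher is usually expressed as a log-odds ratio. The sketch below is a generic illustration, not the exact procedure behind any published matrix: the function name `log_odds`, the scaling factor of 2 and the example frequencies are all assumptions.

```python
import math

def log_odds(observed_freq, expected_freq, scale=2.0):
    """Scaled log2 ratio of observed to chance-expected pair frequency.

    Positive when the pair occurs more often than chance predicts,
    negative when it occurs less often.
    """
    return scale * math.log2(observed_freq / expected_freq)

# Hypothetical frequencies, chosen only to show the sign of the score.
print(round(log_odds(0.04, 0.01)))   # pair seen 4x more often than chance
print(round(log_odds(0.005, 0.01)))  # pair seen half as often as chance
```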
106.

Which of the following does not describe PAM matrices?

A. These matrices are used in optimal alignment scoring
B. It stands for Point Altered Mutations
C. It stands for Point Accepted Mutations
D. It was first developed by Margaret Dayhoff
Answer» B. It stands for Point Altered Mutations
107.

Which of the following is untrue regarding the scoring system used in dynamic programming?

A. If the residues are the same in both sequences, the match score is assumed to be +5, which is added to the score of the cell diagonal to the current cell
B. If the residues are not the same, the mismatch score is assumed to be -3
C. If the residues are not the same, the mismatch score is assumed to be 3
D. The score should be added to the score of the cell diagonal to the current cell
Answer» C. If the residues are not the same, the mismatch score is assumed to be 3
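
The scoring rule in this question can be written as a one-cell update. This is a minimal sketch using the +5 match and -3 mismatch values from the options. The gap penalty of -4 and the function name `cell_score` are assumptions added for illustration.

```python
MATCH, MISMATCH = 5, -3   # values taken from the question
GAP = -4                  # assumed gap penalty, not given in the question

def cell_score(diag, up, left, a, b):
    """Score of the current cell from its three neighbouring cells.

    The match or mismatch score is added to the diagonal neighbour;
    moving from the cell above or to the left costs a gap penalty.
    """
    substitution = MATCH if a == b else MISMATCH
    return max(diag + substitution, up + GAP, left + GAP)

print(cell_score(0, -4, -4, "A", "A"))  # match: 0 + 5
print(cell_score(0, -4, -4, "A", "G"))  # mismatch: 0 - 3
```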
108.

Which of the following best defines regular expressions?

A. They are made up of terms, operators and modifiers
B. They describe string or set of strings to find matching patterns
C. They are strictly restricted to alignment and corresponding score
D. They consist of set of rules for the connotations of various amino acid residues
Answer» B. They describe string or set of strings to find matching patterns
109.

When scanning for similarities in motifs, how do regular expression techniques work?

A. It represents a sequence family by a string of characters and further compares them
B. An algorithm similar to dynamic programming is used
C. Dot matrix analysis is used in this type of sequence analysis
D. Matrix analysis methods are used in this type
Answer» A. It represents a sequence family by a string of characters and further compares them
110.

In regular expression terminology, which of the following is false about terms and operators?

A. Terms are strings or substrings
B. Operators combine terms and expressions
C. Operators do not have precedence
D. Operators have precedence like arithmetic operators
Answer» C. Operators do not have precedence
111.

In regular expressions, which of the following patterns is wrongly matched with its meaning?

A. ‘-’ – separator
B. ‘<’ – N-terminal
C. ‘>’ – C-terminal
D. ‘>>’ – end
Answer» D. ‘>>’ – end
112.

Point out the wrong or irrelevant mathematical method in motif analysis.

A. Enumeration
B. Probabilistic Optimization
C. Deterministic Optimization
D. Literature mining
Answer» D. Literature mining
113.

Which of the following is false regarding the InterPro database and its algorithm?

A. InterPro is an integrated pattern database designed to unify multiple databases for protein domains and functional sites
B. This database integrates information from PROSITE, Pfam, PRINTS, ProDom, and SMART databases
C. Only overlapping motifs and domains in a protein sequence derived by all five databases are included
D. All the motifs and domains in a protein sequence derived by all five databases are included
Answer» D. All the motifs and domains in a protein sequence derived by all five databases are included
114.

Which of the following is false regarding CDART and its algorithm?

A. CDART is a domain search program that combines the results from RPS-BLAST, SMART, and Pfam
B. The program is now an integral part of the regular BLAST search function
C. CDART is a substitute for individual database searches
D. It stands for Conserved Domain Architecture
Answer» C. CDART is a substitute for individual database searches
115.

Which of the following is false regarding the Pfam database and its algorithm?

A. Each motif or domain is represented by an HMM profile generated from the seed alignment of a number of conserved homologous proteins
B. Since the probability scoring mechanism is more complex in HMM than in a profile-based approach the use of HMM yields further increases in sensitivity of the database matches
C. Pfam-B only contains sequence families not covered in Pfam
D. The functional annotation of motifs in Pfam-A is often related to that in UNIPROT
Answer» D. The functional annotation of motifs in Pfam-A is often related to that in UNIPROT
116.

Which of the following is false regarding the SMART database and its algorithm?

A. Contains HMM profiles constructed from manually refined protein domain alignments
B. Alignments in the database are built based on tertiary structures whenever available or based on PSI-BLAST profiles
C. Alignments are further checked but not refined by human annotators before HMM profile construction
D. SMART stands for Simple Modular Architecture Research Tool
Answer» C. Alignments are further checked but not refined by human annotators before HMM profile construction
117.

Which of the following statements about the features of the CATH-Gene3D and HAMAP databases is incorrect?

A. CATH-Gene3D describes protein families and domain architectures in complete genomes
B. In CATH-Gene3D, the functional annotation is provided to proteins from a single resource
C. HAMAP profiles are manually created by expert curators; they identify proteins that are part of well-conserved bacterial, archaeal and plastid-encoded protein families or subfamilies
D. HAMAP stands for High-quality Automated and Manual Annotation of microbial Proteomes
Answer» B. In CATH-Gene3D, the functional annotation is provided to proteins from a single resource
118.

Which of the following statements about the features of the PRINTS and ProDom databases is incorrect?

A. PRINTS is a compendium of protein fingerprints
B. Usually the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3D-space
C. Current versions of ProDom are built using a novel procedure based on recursive BLAST searches
D. ProDom domain database consists of an automatic compilation of homologous domains
Answer» C. Current versions of ProDom are built using a novel procedure based on recursive BLAST searches
119.

Which of the following statements about the features of the PANTHER and TIGRFAMs databases is incorrect?

A. TIGRFAMs provides a tool for identifying functionally related proteins based on sequence homology
B. TIGRFAMs is a collection of protein families, featuring curated multiple sequence alignments, hidden Markov models (HMMs) and annotation
C. Hidden Markov models (HMMs) are not used in PANTHER
D. PANTHER is a large collection of protein families that have been subdivided into functionally related subfamilies, using human expertise
Answer» C. Hidden Markov models (HMMs) are not used in PANTHER
120.

What is the source of protein structures in SCOP and CATH?

A. Uniprot
B. Protein Data Bank
C. Ensembl
D. InterPro
Answer» B. Protein Data Bank
121.

Which of the following statements about the features of the SUPERFAMILY database is incorrect?

A. Sequences can be submitted in raw or FASTA format
B. Sequences must be submitted in FASTA format only
C. It searches the database using a superfamily, family, or species name plus a sequence, SCOP, PDB or HMM IDs
D. It has generated GO annotations for evolutionarily closed domains and distant domains
Answer» B. Sequences must be submitted in FASTA format only
122.

Which of the following is not an advantage of statistical model methods in analyzing protein motifs?

A. They preserve the sequence information from a multiple sequence alignment and express it with probabilistic models
B. Statistical models allow partial matches and compensate for unobserved sequence patterns using pseudo-counts
C. Statistical models have stronger predictive power than the regular expression based approach, even when they are derived from a limited set of sequences
D. The comparative flexibility is less in case of these methods when compared to regular expressions methods
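Answer» D. The comparative flexibility is less in case of these methods when compared to regular expressions methods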
123.

Which of the following is not an advantageous feature or algorithm of the database PRINTS?

A. This program breaks down a motif into even smaller non-overlapping units called ‘fingerprints’, which are represented by unweighted PSSMs
B. To define a motif, at least a majority of fingerprints are required to match with a query sequence
C. A query that has simultaneous high-scoring matches to a majority of fingerprints belonging to a motif is a good indication of containing the functional motif
D. Difficulty recognizing short motifs when they reach the size of single fingerprints
Answer» D. Difficulty recognizing short motifs when they reach the size of single fingerprints
124.

For motif scanning, which of the following programs or databases provides regulatory sites curated from the scientific literature?

A. ENSEMBL
B. ORegAnno
C. MAST
D. Clover
Answer» B. ORegAnno
125.

Which of the following is untrue regarding the BLOCKS database?

A. The alignments are automatically generated using the same data sets used for deriving the BLOSUM matrices
B. The derived ungapped alignments, called ‘blocks’, are usually longer than motifs and are subsequently converted to PSSMs
C. A weighting scheme and pseudo counts are subsequently applied to the PSSMs to account for underrepresented and unobserved residues in alignments
D. The functional annotation of blocks is not consistent with that for the motifs
Answer» D. The functional annotation of blocks is not consistent with that for the motifs
126.

Which of the following statements about InterPro is incorrect regarding its features?

A. Protein relatedness is defined by the P-values from the BLAST alignments
B. The most closely related sequences are grouped into the lowest level clusters
C. More distant protein groups are merged into higher levels of clusters
D. The outcome of this cluster merging is a tree-like structure of functional categories
Answer» A. Protein relatedness is defined by the P-values from the BLAST alignments
127.

Which of the following multipurpose packages uses the Gibbs sampling algorithm?

A. Consensus
B. BEST
C. AlignACE
D. PhyloCon
Answer» C. AlignACE
128.

Which of the following statements about the features of COG is incorrect?

A. Currently, there are 4,873 clusters in the COG databases derived from unicellular organisms
B. It is constructed by comparing protein sequences encoded in forty-three completely sequenced genomes, which are mainly from prokaryotes, representing thirty major phylogenetic lineages
C. The interface for sequence searching in the COG database is the COGnitor program, which is based on gapped BLAST
D. It is a protein family database based on structural classification
Answer» D. It is a protein family database based on structural classification
129.

Which of the following is not a member database of InterPro?

A. SCOP
B. HAMAP
C. PANTHER
D. Pfam
Answer» A. SCOP
130.

Pfam is available at four locations around the world. Which of the following is not one of them?

A. UK
B. Sweden
C. US
D. Japan
Answer» D. Japan
131.

Which of the following statements about the features of SCOP is incorrect?

A. Proteins with the same shapes but having little sequence or functional similarity are placed in different super families, and are assumed to have only a very distant common ancestor
B. Proteins having the same shape and some similarity of sequence and/or function are placed in ‘families’, and are assumed to have a closer common ancestor
C. SCOP was created in 1994 in the Centre of Protein Engineering and the University College London
D. It aims to determine the evolutionary relationship between proteins
Answer» C. SCOP was created in 1994 in the Centre of Protein Engineering and the University College London
132.

When did Needleman and Wunsch first describe the algorithm for global alignment?

A. 1899
B. 1970
C. 1930
D. 1950
Answer» B. 1970
133.

Which of the following is not a disadvantage of the Needleman-Wunsch algorithm?

A. This method is comparatively slow
B. It is memory-intensive
C. This cannot be applied on genome sized sequences
D. This method can be applied to even large sized sequences
Answer» D. This method can be applied to even large sized sequences
134.

Which of the following is not an advantage of the Needleman-Wunsch algorithm?

A. New algorithmic improvements as well as increasing computer capacity make it possible to align a query sequence against a large DB in a few minutes
B. Similar sequence region is of same order and orientation
C. This does not help in determining evolutionary relationship
D. If you have 2 genes that are already understood as closely related, then this type of algorithm can be used to understand them in further details
Answer» C. This does not help in determining evolutionary relationship
135.

Which of the following does not describe dynamic programming?

A. The approach compares every pair of characters in the two sequences and generates an alignment, which is the best or optimal
B. Global alignment algorithm is based on this method
C. Local alignment algorithm is based on this method
D. The method can be useful in aligning protein sequences to protein sequences only
Answer» D. The method can be useful in aligning protein sequences to protein sequences only
136.

Which of the following does not describe the global alignment algorithm?

A. In the initialization step, the first row and first column are subject to gap penalty
B. Score can be negative
C. In the traceback step, the path begins at the lower-right cell of the matrix and ends at the top-left cell
D. First row and first column are set to zero
Answer» D. First row and first column are set to zero
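
The properties of global alignment listed in this question can be seen in a compact Needleman-Wunsch sketch. It assumes a linear gap penalty of -4 together with +5 for a match and -3 for a mismatch. These values and the function name `needleman_wunsch` are illustrative choices, and ties in the traceback are broken in favour of the diagonal move.

```python
def needleman_wunsch(s, t, match=5, mismatch=-3, gap=-4):
    """Return (score, aligned_s, aligned_t) for a global alignment."""
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    # Initialization: first row and column accumulate gap penalties.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill: each cell takes the best of diagonal, up and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Traceback: start at the lower-right cell, stop at the top-left cell.
    a, b = [], []
    i, j = len(s), len(t)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if s[i - 1] == t[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                a.append(s[i - 1]); b.append(t[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(s[i - 1]); b.append("-")
            i -= 1
        else:
            a.append("-"); b.append(t[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(a)), "".join(reversed(b))

print(needleman_wunsch("ACGT", "ACT"))
print(needleman_wunsch("A", "G"))  # a global score can be negative
```

Setting the first row and column to zero instead, and clamping every cell at zero, would turn this into a local (Smith-Waterman style) alignment.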
137.

What is the typical length of a motif, in amino acid residues?

A. 30-60
B. 10-20
C. 70-90
D. 1-10
Answer» B. 10-20
138.

Which of the following is false about the ‘loop’ structure in proteins?

A. They connect helices and sheets
B. They are more tolerant of mutations
C. They are more flexible and can adopt multiple conformations
D. They are never the components of active sites
Answer» D. They are never the components of active sites
139.

On average, what is the length of a typical domain?

A. About 100 residues
B. About 300 residues
C. About 500 residues
D. About 900 residues
Answer» A. About 100 residues
140.

Which of the following least describes long-loop β-hairpins?

A. They are often referred to as a ‘random coil’ conformation
B. Generally they are referred to as the β-meander supersecondary structure
C. Loop looks similar to the Greek Letter Ω
D. Wide-range of conformations with very specific sequence preferences
Answer» D. Wide-range of conformations with very specific sequence preferences
141.

Which of the following common structural motifs is described wrongly?

A. β-hairpin – adjacent antiparallel strands
B. Greek key – 4 adjacent antiparallel strands
C. β-α-β – 2 parallel strands connected by helix
D. β-α-β – 2 antiparallel strands connected by helix
Answer» D. β-α-β – 2 antiparallel strands connected by helix
142.

Which of the following is untrue about the PRSS program?

A. It stands for Probability of Random Shuffles
B. It is a web-based program that can be used to evaluate the statistical significance of DNA or protein sequence alignment
C. It first aligns two sequences using the Needleman-Wunsch algorithm and calculates the score
D. It holds one sequence in its original form and randomizes the order of residues in the other sequence.
Answer» C. It first aligns two sequences using the Needleman-Wunsch algorithm and calculates the score
143.

The major disadvantage of the PRSS program is that it doesn’t allow partial shuffling.

A. True
B. False
Answer» B. False
144.

It is not known whether the Gumbel distribution applies equally well to gapped alignments.

A. True
B. False
Answer» A. True
145.

If the score is located in the extreme margin of the distribution, that means that the alignment between the two sequences is ______ due to random chance and is thus considered ______

A. unlikely, significant
B. very likely, insignificant
C. unlikely, insignificant
D. very likely, significant
Answer» A. unlikely, significant
146.

What is used to generate parameters for the extreme value distribution?

A. The pool of alignment scores from the shuffled sequences
B. A single score of a shuffled sequence
C. The pool of alignment scores from the unshuffled sequences
D. The basic optimal score computed at the beginning of the test
Answer» A. The pool of alignment scores from the shuffled sequences
147.

The statistical test uses a randomization process in which one of the two given sequences is randomly shuffled.

A. True
B. False
Answer» A. True
148.

Which of the following is a part of the statistical test of sequences?

A. An optimal alignment between two chosen sequences is obtained at the end
B. Unrelated sequences of the same length are then generated through a randomization process
C. Unrelated sequences of different lengths are then generated through a randomization process
D. Related sequences of the same length are then generated through a randomization process
Answer» B. Unrelated sequences of the same length are then generated through a randomization process
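
The statistical test covered in the preceding questions can be sketched as a shuffle test: align the two sequences, then repeatedly shuffle one of them (which keeps its length and composition) and realign. The sketch below is an illustration under stated assumptions, not the PRSS implementation. It uses a simple Smith-Waterman score with assumed parameters (+5, -3, gap -4) and reports an empirical p-value in place of a fitted extreme value distribution. All function names are made up.

```python
import random

def sw_score(s, t, match=5, mismatch=-3, gap=-4):
    """Best local alignment score (Smith-Waterman, linear gap penalty)."""
    prev = [0] * (len(t) + 1)
    best = 0
    for a in s:
        curr = [0]
        for j, b in enumerate(t, start=1):
            sub = match if a == b else mismatch
            curr.append(max(0, prev[j - 1] + sub,
                            prev[j] + gap, curr[j - 1] + gap))
        best = max(best, max(curr))
        prev = curr
    return best

def shuffle_test(s, t, trials=200, seed=0):
    """Compare the real score with scores against shuffled copies of t."""
    rng = random.Random(seed)
    observed = sw_score(s, t)
    residues = list(t)
    null_scores = []
    for _ in range(trials):
        rng.shuffle(residues)  # same length and composition, order destroyed
        null_scores.append(sw_score(s, "".join(residues)))
    # Fraction of shuffled scores at least as high as the observed one;
    # the +1 terms avoid reporting a p-value of exactly zero.
    exceed = sum(score >= observed for score in null_scores)
    p_value = (exceed + 1) / (trials + 1)
    return observed, null_scores, p_value

# Hypothetical peptide aligned against itself, for illustration only.
seq = "MKTAYIAKQRQISFVKSHFSRQ"
observed, null_scores, p_value = shuffle_test(seq, seq, trials=50)
print(observed, max(null_scores), p_value)
```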
149.

Many studies have demonstrated that the distribution of similarity scores assumes a peculiar shape that resembles a highly skewed normal distribution with a long tail on one side. The distribution matches the _______

A. Gumbel elective value distribution
B. Gumbel extreme void distribution
C. Gumbel end value distribution
D. Gumbel extreme value distribution
Answer» D. Gumbel extreme value distribution
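
To illustrate how a pool of scores from shuffled sequences can yield distribution parameters, here is a minimal sketch that fits a Gumbel extreme value distribution by the method of moments and evaluates its upper tail. The method of moments is a simple stand-in chosen for brevity and is not claimed to be what any particular program uses. The function names and the example scores are made up.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel(scores):
    """Estimate location (mu) and scale (beta) by the method of moments."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    beta = sd * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta

def tail_probability(x, mu, beta):
    """P(score >= x) under a Gumbel distribution with parameters mu, beta."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))

# Hypothetical scores from shuffled-sequence alignments.
null_scores = [20, 22, 25, 21, 30, 24, 23, 26]
mu, beta = fit_gumbel(null_scores)
print(tail_probability(60, mu, beta))  # a score far out in the tail
```

A score far out in the long tail gets a small tail probability, which is the sense in which such an alignment is unlikely to arise by chance.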
150.

By calculating alignment scores of a large number of ______ sequence pairs, a distribution model of the ______ sequence scores can be derived.

A. related, randomized
B. unrelated, randomized
C. unrelated, unrandomized
D. related, unrandomized
Answer» B. unrelated, randomized