Wednesday, May 30, 2007

Biology in the Computer Age.

From the interaction of species and populations to the function of tissues and cells within an individual organism, biology is defined as the study of living organisms. Now, at the beginning of the 21st century, we use sophisticated laboratory technologies that allow us to collect data faster than we can interpret it. We have large volumes of DNA sequence data at our fingertips, but how do we figure out which parts of that DNA control the various chemical processes of life?
Bioinformatics is the science of using information to understand biology. Strictly speaking, bioinformatics is a subset of the larger field of computational biology, the application of quantitative analytical techniques in modeling biological systems.
The field of bioinformatics relies heavily on work by experts in statistical methods and pattern recognition. Researchers come to bioinformatics from many fields, including mathematics, computer science, and linguistics. Unfortunately, biology is a science of the specific as well as the general. Bioinformatics is full of pitfalls for those who look for patterns and make predictions without completely understanding where biological data comes from and what it means. By providing algorithms, databases, user interfaces and statistical tools, bioinformatics makes it possible to do exciting things such as compare DNA sequences and generate results that are potentially significant. These new tools also give you the opportunity to overinterpret data and assign meaning where none really exists.
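As a small illustration of what "comparing DNA sequences" can mean at its very simplest, here is a minimal Python sketch that computes the percent identity of two sequences of equal length. The function name `percent_identity` is made up for this example, and no alignment is performed, so this is a toy under stated assumptions rather than how real comparison tools work.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences match.

    Assumption: the sequences are already aligned and have the same
    length. Real tools align the sequences first; this sketch does not.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (no alignment is done)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring upper/lower case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

A high score from a function like this says nothing by itself about biological significance, which is exactly the overinterpretation pitfall described above.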

What Informatics Means to Biologists....
The science of informatics is concerned with the representation, organization, manipulation, distribution, maintenance and use of information, particularly in digital form. There is more than one interpretation of what bioinformatics actually means, and it's quite possible to go out and apply for a job doing bioinformatics and find that the expectations of the job are entirely different from what you thought.
The functional aspect of bioinformatics is the representation, storage and distribution of data. Intelligent design of data formats and databases, creation of tools to query those databases, and development of user interfaces that bring together different tools to allow the user to ask complex questions about the data are all aspects of the development of bioinformatics infrastructure.
Developing analytical tools to discover knowledge in data is the second and more scientific aspect of bioinformatics. There are many levels at which we use biological information, whether we are comparing sequences to develop a hypothesis about the function of a newly discovered gene, breaking down known 3D protein structures into bits to find patterns that can help predict how the protein folds, or modeling how proteins and metabolites in a cell work together to make the cell function. The ultimate goal of analytical bioinformaticians is to develop predictive methods that allow scientists to model the function and phenotype of an organism based only on its genome sequence. This is a grand goal, and one that will be approached only in small steps, by many scientists working together.

Challenges Biology Offers Computer Scientists....
The goal of biology is to develop a quantitative understanding of how living things are built from the genome that encodes them.
Cracking the genome code is complex. At the simplest level, we still have difficulty identifying unknown genes by computer analysis of genome sequence. We still have not managed to predict or model how a chain of amino acids folds into the specific structure of a functional protein.
Beyond the single-molecule level, the challenges are immense. The sheer amount of data in GenBank is now growing at an exponential rate, and as data types beyond DNA, RNA and protein sequence begin to undergo the same kind of explosion, simply managing, accessing and presenting this data to users in an intelligible form is a critical task. Human-computer interaction specialists need to work closely with academic and clinical researchers in the biological sciences to manage such staggering amounts of data.
Biological data is very complex and interlinked. A spot on a DNA array, for instance, is connected not only to immediate information about its intensity, but to layers of information about genomic location, DNA sequences, structure, function, and more. Creating information systems that allow biologists to seamlessly follow these links without getting lost in a sea of information is also a huge opportunity for computer scientists.
Finally, each gene in the genome isn't an independent entity. Multiple genes interact to form biochemical pathways, which in turn feed into other pathways. Biochemistry is influenced by the external environment, by interaction with pathogens, and by other stimuli. Putting genomic, biochemical and physiological data together will be the work of a generation of computational biologists. Computer scientists, mathematicians and statisticians will be a vital part of this effort.
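To give a feel for why gene identification is hard, here is a deliberately naive Python sketch that scans the forward strand of a DNA string for open reading frames (a start codon ATG followed, in the same frame, by a stop codon). The name `find_orfs` and the `min_codons` threshold are invented for this example. It ignores the reverse strand, introns, and every statistical signal that real gene finders rely on, which is part of the point: a scan like this reports many candidates that are not genes and misses many that are.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of ORFs on the forward strand.

    Each ORF runs from an ATG to the first in-frame stop codon
    (stop included, end index exclusive). `min_codons` counts the
    codons before the stop, including the ATG itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGCC"
    for begin, end in find_orfs(sequence):
        print(begin, end, sequence[begin:end])
```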

Skills of a Bioinformatician....
  1. Should have a fairly deep background in some aspect of molecular biology.
  2. Must absolutely understand the central dogma of molecular biology. Understanding how and why DNA sequence is transcribed into RNA and translated into protein is vital.
  3. Should have substantial experience with at least one or two major molecular biology software packages, either for sequence analysis or molecular modeling. The experience of learning one of these packages makes it much easier to use other software quickly.
  4. Should be comfortable working in a command-line computing environment.
  5. Should have experience with programming in a computer language such as C/C++, as well as in a scripting language such as Perl or Python.
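
The central dogma mentioned in point 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the input is taken to be the coding strand of the DNA, the standard genetic code is used, translation starts at the first base, and the helper names `transcribe` and `translate` are made up for this example.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order
# for each of the three codon positions; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """DNA coding strand -> mRNA (replace T with U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein (one-letter codes), stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGTTTGGATGA"
    mrna = transcribe(dna)
    print(mrna)
    print(translate(mrna))
```

Running the script transcribes the short made-up sequence and prints the resulting mRNA followed by its protein translation.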
