Bioinformatics FAQ

Bioinformatics FAQ, written by Damian Counsell.

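Overview

Contents

  • Definitions: What is Bioinformatics?
  • History of bioinformatics: How old is the discipline?
  • Books: Can you recommend any bioinformatics books?
  • General introductions
  • Computational/Mathematical aspects of bioinformatics
  • Applying bioinformatics to biological research
  • Fiction
  • Other lists of bioinformatics books
  • Centres of Bioinformatics activity: Where is bioinformatics done?
  • Research centres
  • Sequencing centres
  • Standards centres
  • Are there any standards in bioinformatics?
  • "Virtual" centres (for example consortia and communities)
  • Online resources: What bioinformatics Websites are there?
  • Blogs
  • Information
  • Directories
  • Portals
  • Societies
  • Collections of tools
  • Tutorials
  • Education: Where can I study bioinformatics...
  • ... in Africa?
  • ... in the Americas?
  • ... in Asia?
  • ... in Australasia?
  • ... in Europe?
  • ... remotely (Distance/Correspondence courses)?
  • Careers: How can I become a bioinformatician?
  • Practical tips: How can I tackle specific, common bioinformatics tasks?
  • How can I find a sequence?
  • ... I have a description.
  • ... I have an accession number.
  • ... I have another sequence.
  • ... I'm not sure whether to use the default values.
  • How can I align two sequences?
  • How can I predict the function of a gene (product)?
  • How can I predict the structure of a sequence?
  • How can I keep records?
  • Glossary of bioinformatics terms
  • What is an alignment?
  • What is a DNA array?
  • What is a homologue?
  • What is an ontology?
  • What is a scoring matrix?
  • Acknowledgements
  • Small print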

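    Definitions of Bioinformatics: What is bioinformatics?

    Roughly, bioinformatics describes any use of computers to handle biological information.

    In practice, the definition used by most people is narrower; bioinformatics to them is a synonym for "computational molecular biology": the use of computers to characterize the molecular components of living things.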

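    What is Bioinformatics? The Tight Definition

    "Classical" bioinformatics

    Most biologists talk about "doing bioinformatics" when they use computers to store, compare, retrieve, analyse or predict the composition or the structure of biomolecules. As computers become more powerful you could probably add simulate to this list of bioinformatics verbs. "Biomolecules" include your genetic material (nucleic acids) and the products of your genes: proteins. These are the concerns of "classical" bioinformatics, dealing primarily with sequence analysis.

    Khairuddin Itam drew my attention to this crisp definition of bioinformatics, dating back to 1987, from P. Hogeweg: "[Bioinformatics is] the study of informatic processes in biotic systems".

    Fredj Tekaia at the Institut Pasteur offers this definition of bioinformatics:

    "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information."

    It is a mathematically interesting property of most large biological molecules that they are polymers: ordered chains of simpler molecular modules called monomers. Think of the monomers as beads or building blocks which, despite having different colours and shapes, all have the same thickness and the same way of connecting to one another.

    Monomers that can combine in a chain are of the same general class, but each kind of monomer in that class has its own well-defined set of characteristics.

    Many monomer molecules can be joined together to form a single, far larger, macromolecule. Macromolecules can have exquisitely specific informational content and/or chemical properties.

    According to this scheme, the monomers in a given macromolecule of DNA or protein can be treated computationally as letters of an alphabet, put together in pre-programmed arrangements to carry messages or do work in a cell.

    "New" bioinformatics

    The greatest achievement of bioinformatics methods, the Human Genome Project, is currently being completed. Because of this the nature and priorities of bioinformatics research and applications are changing. Commentators often talk about our living in the "post-genomic" era. My personal view is that this will affect bioinformatics in several ways:

    This FAQ concentrates on classical bioinformatics, but will, I hope, grow to cover more of the "post-genomic" aspects of the field. It is worth noting that all of the above non-classical areas of research depend upon established sequence analysis techniques.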
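    The view of a macromolecule as a string of letters drawn from a fixed alphabet, on which classical sequence analysis rests, can be made concrete with a few lines of code. The following is only a minimal sketch: the function names and the example sequence are invented for illustration, and the alphabets are assumed to be the four DNA bases and the twenty standard amino acids in their one-letter codes.

```python
from collections import Counter

# Assumed alphabets: one letter per monomer.
DNA_ALPHABET = set("ACGT")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")


def is_valid(sequence, alphabet):
    """Return True if every letter of the sequence belongs to the alphabet."""
    return set(sequence.upper()) <= alphabet


def composition(sequence):
    """Count how often each monomer (letter) occurs in the sequence."""
    return Counter(sequence.upper())


# Hypothetical example sequence, not taken from any real gene.
example_dna = "ATGGCGTTAGCA"
print(is_valid(example_dna, DNA_ALPHABET))
print(composition(example_dna))
```

    Everything else in sequence analysis (searching, aligning, assembling) builds on this string representation.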

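    Definitions of fields related to Bioinformatics

    What is Biophysics?

    Molecular biology grew out of biophysics. The British Biophysical Society defines biophysics as:

    "an interdisciplinary field which applies techniques from the physical sciences to understanding biological structure and function"

    More information about the various facets of the discipline can be found at the Society's site, hosted at Birkbeck College, London.

    Mike Goodrich wrote to ask what the status of biophysics was, given the definition of computational biology submitted by Paul Schulte (below). A recent article in The Scientist [free registration required] addressed this question; thanks to Jo Wixon (Managing Editor of Comparative and Functional Genomics) for the reference.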

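    What is Computational Biology?

    Computational biologists might object, but I find that people use "computational biology" when discussing that subset of bioinformatics (in the broad sense) closest to the field of classical general biology.

    Computational biologists interest themselves more with evolutionary, population and theoretical biology than with cell and molecular biomedicine. It is inevitable that molecular biology is profoundly important in computational biology, but it is certainly not what computational biology is all about (see the next paragraph). In these areas of computational biology it seems that computational biologists have tended to prefer statistical models of biological phenomena to physico-chemical ones. This is often wise...

    One computational biologist (Paul J Schulte) did object to the above and makes the entirely valid point that this definition derives from a popular use of the term, rather than a correct one. Paul works on water flow in plant cells. He points out that biological fluid dynamics is a field of computational biology in itself. He argues that this, and any application of computing to biology, can be described as "computational biology" (see also the "loose" definition of bioinformatics below). Where we disagree, perhaps, is in the conclusion he draws from this, which I reproduce in full:

    "Computational biology is not a "field", but an "approach" involving the use of computers to study biological processes and hence it is an area as diverse as biology itself."

    Richard Durbin, Head of Informatics at the Wellcome Trust Sanger Institute, expressed an interesting opinion on this distinction in an interview:

    "I do not think all biological computing is bioinformatics, e.g. mathematical modelling is not bioinformatics, even when connected with biology-related problems. In my opinion, bioinformatics has to do with management and the subsequent use of biological information, particularly genetic information."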

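    What is Medical Informatics?

    The Medical Informatics FAQ (no relation) provides the following definition:

    "Biomedical Informatics is an emerging discipline that has been defined as the study, invention, and implementation of structures and algorithms to improve communication, understanding and management of medical information."

    That FAQ also points to a further definition.

    Aamir Zakaria, the author of the FAQ, emphasizes that medical informatics is more concerned with the structures and algorithms for the manipulation of medical data than with the data itself.

    This suggests that one difference between bioinformatics and medical informatics as disciplines lies in their approaches to the data; there are bioinformaticians interested in the theory behind the manipulation of that data and there are bioinformatics scientists concerned with the data itself and its biological implications. (I believe that a good bioinformatics researcher should be interested in both of these aspects of the field.)

    Medical informatics, for practical reasons, is more likely to deal with data obtained at "grosser" biological levels, that is, information from super-cellular systems right up to the population level, while most bioinformatics is concerned with information about cellular and biomolecular structures and systems.

    On both of these points I would be happy to have any medical informatics specialists correct me.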

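    What is Cheminformatics?

    The Web advertisement for Cambridge Healthtech Institute's Sixth Annual Cheminformatics conference describes the field thus:

    "the combination of chemical synthesis, biological screening, and data-mining approaches used to guide drug discovery and development"

    but this, again, sounds more like a field being identified by some of its most popular (and lucrative) activities, rather than by including all the diverse studies that come under its general heading.

    The story of one of the most successful drugs of all time, penicillin, seems bizarre, but the way we discover and develop drugs even now has similarities, being the result of chance, observation and a lot of slow, intensive chemistry. Until recently, drug design always seemed doomed to continue to be a labour-intensive, trial-and-error process. The possibility of using information technology to plan intelligently and to automate processes related to the chemical synthesis of possible therapeutic compounds is very exciting for chemists and biochemists. The rewards for bringing a drug to market more rapidly are huge, so naturally this is what a lot of cheminformatics work is about.

    This page, which has a commercial slant, links to some interesting discussions about the term "cheminformatics": what it means, whether it exists as a distinct discipline, and even whether it should be replaced by "chemoinformatics".

    The span of academic cheminformatics is wide and is exemplified by the interests of the cheminformatics groups at the Centre for Molecular and Biomolecular Informatics at the University of Nijmegen in the Netherlands. These interests include:

    Trinity University's Cheminformatics Web page, for another example, concerns itself with cheminformatics as the use of the Internet in chemistry.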

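    What is Genomics?

    Genomics is a field which existed before the completion of the sequences of genomes, but in the crudest of forms; for example, the oft-referenced estimate of 100 000 genes in the human genome was derived from a(n) (in)famous piece of "back of an envelope" genomics, guessing the weight of chromosomes and the density of the genes they bear. Genomics is any attempt to analyse or compare the entire genetic complement of a species or species (plural). It is, of course, possible to compare genomes by comparing more-or-less representative subsets of genes within genomes.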

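    What is Mathematical Biology?

    Mathematical biology is easier to distinguish from bioinformatics than computational biology is. Mathematical biology also tackles biological problems, but the methods it uses to tackle them need not be numerical and need not be implemented in software or hardware. Indeed, such methods need not "solve" anything; in mathematical biology it would be considered reasonable to publish a result which merely establishes that a biological problem belongs to a particular general class.

    The distinction between bioinformatics and mathematical biology was clarified by an email I received from Alex Kasman of the College of Charleston. According to his working definition, he distinguishes bioinformatics (under the tight definition at least)...

    "...which seems to focus almost exclusively on specific algorithms that can be applied to large molecular biological data sets..."

    ...from mathematical biology...

    "...which also includes things of theoretical interest which are not necessarily algorithmic, not necessarily molecular in nature, and not necessarily useful in analysing collected data."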

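    What is Proteomics?

    A recent review of proteomics in the journal Nature defined the field this way:

    "The term proteome was first coined to describe the set of proteins encoded by the genome. The study of the proteome, called proteomics, now evokes not only all the proteins in any given cell, but also the set of all protein isoforms and modifications, the interactions between them, the structural description of proteins and their higher-order complexes, and for that matter almost everything 'post-genomic'."

    Michael J. Dunn, the Editor-in-Chief of Proteomics, defines the "proteome" as:

    "the protein complement of the genome"

    and proteomics as being concerned with:

    "qualitative and quantitative studies of gene expression at the level of the functional proteins themselves"

    that is:

    "an interface between protein biochemistry and molecular biology"

    Characterizing the many tens of thousands of proteins expressed in a given cell type at a given time, whether by measuring their molecular weights or isoelectric points, identifying their ligands or determining their structures, involves the storage and comparison of vast numbers of data. Inevitably this requires bioinformatics. See this constructively sceptical review by Lukas Huber.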

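    What is Pharmacogenomics?

    Pharmacogenomics is the application of genomic approaches and technologies to the identification of drug targets. Examples include trawling entire genomes for potential receptors by bioinformatics means, investigating patterns of gene expression in both pathogens and hosts during infection, or examining the characteristic expression patterns found in tumours or patients' samples for diagnostic purposes (or possibly in the pursuit of potential cancer therapy targets).

    The term "pharmacogenomics" is also used for the more "trivial", but arguably more useful, application of bioinformatics approaches to the cataloguing and processing of information relating to pharmacology and genetics, for example the accumulation of information in databases like this one. (Thanks to Ivanovi.)

    What is Pharmacogenetics?

    All individuals respond differently to drug treatments: some positively, others with little obvious change in their conditions and yet others with side effects or allergic reactions. This variation is known to have a genetic basis. Pharmacogenetics is a subset of pharmacogenomics which uses genomic/bioinformatic methods to identify genomic correlates, for example SNPs (Single Nucleotide Polymorphisms), characteristic of particular patient response profiles, and uses those markers to inform the administration and development of therapies. Strikingly, such approaches have been used to "resurrect" drugs previously thought to be ineffective, but subsequently found to work in a subset of patients. They can also be used to optimize the doses of chemotherapy for particular patients.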

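    Overview of most common bioinformatics programs

    Everyday bioinformatics is done with sequence search programs like BLAST, sequence analysis programs like the EMBOSS and Staden packages, structure prediction programs like THREADER or PHD, or molecular imaging/modelling programs like RasMol and WHATIF.

    Overview of most common bioinformatics technologies

    Currently, a lot of bioinformatics work is concerned with the technology of databases (thanks again to Ivanovi). These databases include both "public" repositories of gene data like GenBank or the Protein DataBank (PDB), and private databases, like those used by research groups involved in gene mapping projects or those held by biotech companies. Making such databases accessible via open standards is very important. Consumers of bioinformatics data use a range of computer platforms: from the more powerful and forbidding UNIX boxes favoured by developers and curators to the far friendlier Macs often found populating the labs of computer-wary biologists.

    Databases of existing sequencing data can be used to identify homologues of new molecules that have been amplified and sequenced in the lab. The property of sharing a common ancestor, homology, can be a very powerful indicator in bioinformatics (see below).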

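    Acquisition of sequence data

    Bioinformatics tools can be used to obtain sequences of genes or proteins of interest, either from material obtained, labelled, prepared and examined in electric fields by individual researchers/groups or from repositories of sequences from previously investigated material.

    Analysis of data

    Both types of sequence can then be analysed in many ways with bioinformatics tools.

    They can be assembled. Note that this is one of the occasions when the meaning of a biological term differs markedly from a computational one (see the hilarious confusion over the issue at the Internet-based geek forum Slashdot). Computer scientists, banish from your mind any thought of assembly language. Sequencing can only be performed for relatively short stretches of a biomolecule and finished sequences are therefore prepared by arranging overlapping "reads" of monomers (single beads on a molecular chain) into a single continuous passage of "code". This is the bioinformatic sense of assembly.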
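    To show what arranging overlapping reads into one continuous passage means in practice, here is a toy greedy assembler. It is only a sketch under simplifying assumptions (error-free reads, all from the same strand, no repeats); real assemblers are far more sophisticated. The function names and the example reads are invented for illustration.

```python
def overlap(left, right, min_length=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_length - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(reads, min_length=3):
    """Greedily join the pair of reads with the largest overlap until none overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_length)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no remaining overlaps: leave the pieces separate
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads


# Three hypothetical overlapping reads of one short stretch of DNA.
print(merge_reads(["ATGGCGT", "GCGTTAG", "TTAGCA"]))
```

    The greedy strategy is chosen here purely for brevity; it can produce wrong answers when the sequence contains repeats.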

    They can be mapped, that is, their sequences can be parsed to find sites where so-called "restriction enzymes" will cut them.
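    Parsing a sequence for restriction sites amounts to a substring search. The sketch below uses the recognition sequences of three well-known enzymes (EcoRI, BamHI and HindIII); the function name, the small enzyme table and the example sequence are assumptions made for illustration, and the code reports where a site starts rather than the exact position of the cut.

```python
# Recognition sequences of three commonly used restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(sequence, enzymes=None):
    """Map each enzyme name to the 0-based positions where its site starts."""
    if enzymes is None:
        enzymes = RECOGNITION_SITES
    sequence = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        positions = []
        start = sequence.find(site)
        while start != -1:
            positions.append(start)
            start = sequence.find(site, start + 1)
        hits[name] = positions
    return hits


# Hypothetical sequence containing two EcoRI sites and one BamHI site.
print(find_sites("TTGAATTCAAGGATCCGAATTC"))
```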

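    They can be compared, usually by aligning corresponding segments and looking for matching and mismatching letters in their sequences. Genes or proteins which are sufficiently similar are likely to be related and are therefore said to be "homologous" to each other; the whole truth is rather more complicated than this. Such cousins are called "homologues".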
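    Once two sequences have been aligned, counting matching and mismatching letters is straightforward. The following sketch assumes the alignment has already been produced by some other tool, that both aligned strings have the same length, and that gaps are written as "-"; the function name and the example strings are hypothetical.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Count matches, mismatches and gapped columns in two aligned sequences.

    Returns (matches, mismatches, gaps, identity), where identity is the
    fraction of ungapped columns in which the two letters agree.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            gaps += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    compared = matches + mismatches
    identity = matches / compared if compared else 0.0
    return matches, mismatches, gaps, identity


# Two hypothetical aligned DNA fragments with one gap and one mismatch.
print(compare_aligned("ATGC-GT", "ATGAAGT"))
```

    Percentage identity is only a crude measure of similarity; whether a given level of identity indicates homology depends on sequence length and composition.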

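    If a homologue (a related molecule) exists, then a newly discovered protein may be modelled, that is, the three-dimensional structure of the gene product can be predicted without doing laboratory experiments.

    Bioinformatics is used in primer design. Primers are short sequences needed to make many copies of (amplify) a piece of DNA, as used in PCR (the Polymerase Chain Reaction).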
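    A few of the basic quantities a primer-design tool works with can be computed directly from a candidate primer's sequence. The sketch below is illustrative only: the function names and the example primer are invented, and the melting temperature uses the simple Wallace rule (2 degrees C for each A or T, 4 degrees C for each G or C), which is a rough estimate intended for short oligonucleotides and not what a production tool would rely on.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def gc_content(dna):
    """Fraction of bases in the sequence that are G or C."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)


def wallace_tm(dna):
    """Rough melting temperature in degrees C by the Wallace rule."""
    dna = dna.upper()
    at = dna.count("A") + dna.count("T")
    gc = dna.count("G") + dna.count("C")
    return 2 * at + 4 * gc


# Hypothetical 12-base primer candidate.
primer = "ATGGCGTTAGCA"
print(reverse_complement(primer), gc_content(primer), wallace_tm(primer))
```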

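    Bioinformatics is used to attempt to predict the function of actual gene products.

    Information about the similarity and, by implication, the relatedness of proteins is used to trace the "family trees" of different molecules through evolutionary time.

    There are various other applications of computer analysis to sequence data but, with so much raw data being generated by the Human Genome Project and other initiatives in biology, computers are presently essential for many biologists just to manage their day-to-day results.

    Molecular modelling / structural biology is a growing field which can be considered part of bioinformatics. There are, for example, tools which allow you (often via the Net) to make pretty good predictions of the secondary structure of proteins arising from a given amino acid sequence, often based on known "solved" structures and other sequenced molecules acquired by structural biologists.

    Structural biologists use "bioinformatics" to handle the vast and complex data from X-ray crystallography, nuclear magnetic resonance (NMR) and electron microscopy investigations and to create the 3-D models of molecules that seem to be everywhere in the media.

    Note

    Unfortunately the word "map" is used in several different ways in biology/genetics/bioinformatics. The definition given above is the one most frequently used in this context, but a gene can be said to be "mapped" when its parent chromosome has been identified, when its physical or genetic distance from other genes has been established and, frequently, when the structure and locations of its various coding components (its "exons") have been established.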

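    What is Bioinformatics? The Loose Definition

    There are other fields, for example medical imaging / image analysis, which might be considered part of bioinformatics. There is also a whole other discipline of biologically-inspired computation: genetic algorithms, AI, neural networks. Often these areas interact in strange ways. Neural networks, inspired by crude models of the functioning of nerve cells in the brain, are used in a program called PHD to predict, surprisingly accurately, the secondary structures of proteins from their primary sequences.

    What almost all bioinformatics has in common is the processing of large amounts of biologically-derived information, whether DNA sequences or breast X-rays.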

    How old is the discipline?

    "How old is bioinformatics?" The answer to this one depends on which source you choose to read.

    From T K Attwood and D J Parry-Smith's "Introduction to Bioinformatics", Prentice-Hall 1999 [Longman Higher Education; ISBN 0582327881]:

    "The term bioinformatics is used to encompass almost all computer applications in biological sciences, but was originally coined in the mid-1980s for the analysis of biological sequence data."

    From Mark S. Boguski's article in the "Trends Guide to Bioinformatics" Elsevier, Trends Supplement 1998 p1:

    "The term "bioinformatics" is a relatively recent invention, not appearing in the literature until 1991 and then only in the context of the emergence of electronic publishing...

    "...However, some of my role models when I was a graduate student (Margaret O. Dayhoff, Russell F. Doolittle, Walter M. Fitch and Andrew D. McLachlan) had been building databases, developing algorithms and making biological discoveries by sequence analysis since the 1960s---long before anyone thought to label this activity with a special term (if anything it was called `molecular evolution'). Even a relatively new kid on the block, the National Center for Biotechnology Information (NCBI), is celebrating its 10th anniversary this year, having been written into existence by US Congressman Claude Pepper and President Ronald Reagan in 1988. So bioinformatics has, in fact, been in existence for more than 30 years and is now middle-aged."

    Books: Can you recommend any bioinformatics books?

    It's notoriously difficult to find any books on bioinformatics itself that cater well for all of those coming from computing, from mathematics and from biology backgrounds. The few textbooks available in the field tend to be eyewateringly expensive as well. I've divided suggested reading into books of general interest, those best suited to people coming from a computational/mathematical background and books for biologists interested in bioinformatics. Where a book is also listed in Bioinformatics.Org's books section I have linked the title to the relevant entry there. Links to other lists of bioinformatics books follow this section of suggested reading.

    General introductions

    Many people are curious about the Human Genome (Project). The completion of the first draft probably represents bioinformatics' coming of age as a discipline. The first couple of books are aimed at the intelligent layperson.

    A gossipy and insightful account of the race to sequence the genome can be found in "The Sequence" by Kevin Davies [Weidenfeld; ISBN 0297646982]. Matt Ridley's "Genome" [Fourth Estate; ISBN 185702835X] is both an interesting layperson's introduction to the issues raised by the bioinformatic revolution and an overview of its biology and enormous scope. If I remember rightly, Ridley's book received a slightly snooty review from Walter Bodmer. This is understandable, since his and Robin McKie's excellent "pre-genomic" guide to the Human Genome Mapping Project, "The Book of Life" [Oxford Paperbacks; ISBN 0195114876] was undeservedly in a remainders bin when I bought my copy a couple of years ago.

    If you are a non-biological scientist (or a non-scientist) and are hooked by these, why not go back to the "real beginning" of the race and read James Watson's entertaining and indiscreet memoir of his and Francis Crick's determination of the structure of DNA, "The Double Helix" [Penguin; ISBN 0140268774]---now updated with an introduction by media don Steve Jones.

    Nigel Barber at Peterborough Regional College in the UK recommends Gary Zweiger's "Transducing the Genome" [McGraw-Hill Professional Publishing: ISBN 0071369805]. The summary at Amazon makes it sound a tad pretentious, but all the reviews seem pretty positive so it might be worth a read.

    If you are a quantitative scientist and would like a deeper knowledge of contemporary (molecular) biology, but you want to acquire it as painlessly as possible you could try the following:

    There are two classic competing texts in cell and molecular biology which Maximilian Haeussler reminds me to include: Alberts et al's Molecular Biology of the Cell [Garland Science: ISBN 0815340729] and Molecular Biology of the Gene [Benjamin Cummings: ISBN 0321248643].

    Computational/Mathematical aspects

    If you are a hardcore maths/computing person Michael Waterman's "Introduction to Computational Biology" [Chapman & Hall/CRC Statistics and Mathematics; ISBN 0412993910] and Pavel Pevzner's "Computational Molecular Biology - An Algorithmic Approach" [The MIT Press (A Bradford Book); ISBN 0262161974] will give you all the discrete maths you can shake a stick at, but perfunctory introductions to the biology.

    Bioinformatics.Org's very own Jeff Bizzaro recommends Dan Gusfield's "Algorithms on Strings, Trees and Sequences" [Cambridge, 1997 ISBN 0-52158-519-8], Richard Durbin, S. Eddy, A. Krogh, G. Mitchison "Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids" [Cambridge, 1997 ISBN 0-52162-971-3] (which I think is one of the clearest and most comprehensive guides to alignment algorithms) and---for that full "computers-to-biology conversion"--- Geoffrey M. Cooper "The Cell: A Molecular Approach" [ASM Press, 1996 ISBN 0-87893-119-8]. Jeff Ames writes that a second edition of this book is now available [Sinauer Associates, Incorporated, 2000 ISBN 0-87893-106-6] and that this version---if you can find it in the shops---comes with a CD.

    Applying bioinformatics to biological research

    One outstanding general text for the biologist is David W. Mount's "Bioinformatics" [Cold Spring Harbor Press; ISBN 0879696087]. It's not cheap, but it's the best I've seen if you are studying bioinformatics itself.

    Bioinformatics has been dismissed by some as "the science of BLAST searches". The best collection of advice so far on doing BLAST searches is O'Reilly's BLAST book by Ian Korf, Mark Yandell and Joseph Bedell [O'Reilly ISBN 0-596-00299-8]. I reviewed it enthusiastically, but not uncritically, for the UK UNIX Users' Group magazine. I'd go as far as to say that all biologists thinking of using BLAST in their research should read the relevant sections before they even go near a computer.

    If you wish to use general bioinformatics tools, especially if you are a little wary of computers, my new "best" book is "Bioinformatics for Dummies" [John Wiley and Sons ISBN 0764516965]. It is (obviously) aimed at people who are beginners, who are happier using the Web rather than typing commands, and who are more interested in learning than in impressing people---the writing is friendly, clear and unpretentious. However, like several of my other tips (below) it concentrates on Web-based resources so it will, inevitably, date. (This is partially compensated for by there being a companion Website.)

    Also, if you're coming to the subject as a computer user with a biological background, looking to exploit the many tools available, you might want to try Terry Attwood and David Parry-Smith's "Introduction to Bioinformatics" [Longman Higher Education; ISBN 0582327881], or Des Higgins and Willie Taylor's "Bioinformatics: Sequence Structure and Databanks" [Oxford University Press; ISBN 0199637903]. Another excellent practical introduction is Andreas Baxevanis and Francis Ouellette's "Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins" [Wiley-Interscience; ISBN 0471383910], now in its new and improved second edition. Bax teaches bioinformatics all over Canada and the experience shows. Arthur Lesk has also produced an excellent teaching book particularly for protein bioinformatics in his Introduction to Bioinformatics.

    Bioinformatics.Org also recommends Cynthia Gibas and Per Jambeck's "Developing Bioinformatics Skills" [O'Reilly, 2001 ISBN 1-56592-664-1].

    Stuart Brown recommends his own book "Bioinformatics: A Biologist's Guide to Biocomputing and the Internet" [Eaton Pub Co; ISBN: 188129918X]. If he sends me a review copy I might recommend it too ;-) .

    Fiction books

    "Darwin's Radio" by Greg Bear [Ballantine Books, ISBN: 0345435249] is a wonderful hard SF thriller which stretches ideas derived from genome discoveries to their breaking point. It's gripping and humane.

    Leonard Crane, the author of Ninth Day of Creation kindly sent me a copy for review. So far it's an excellent read. I haven't finished it yet, not because it isn't a rattling good story, but because, like "Darwin's Radio", it is very long and because I am very busy. If you'd like to read a well-researched, but speculative, novel containing actual scenes of practising bioinformatics then try it.

    Ken Allen contributed the following reviews:

    "Frameshift [Tor Books, ISBN: 0812571088] by Robert J. Sawyer---based around the HGP---reasonable read, but poor / confused ending."

    "Calculating God [Tor Books, ISBN: 0812580354] by the same author---has a subtler bio connection and is a much better read. Near the start an alien spacecraft lands, the alien emerges and says 'take me to your paleontologist'."

    Further suggestions for this section are welcome.

    Other lists of bioinformatics books

    See also compbiology.org's list, Steve Brenner's list, and Aik Choon Tan's collection of books.

    Centres of Bioinformatics Activity: Where is bioinformatics done?

    The biggest and best source of bioinformatics links I have encountered is the Genome Web at the Rosalind Franklin Centre for Genomics Research at the Genome Campus near Cambridge, UK. Most of the links below come from that resource. My list is necessarily limited by comparison.

    Research centres

    Sequencing centres

    [XXXX INSERT DETAILS OF MORE SEQUENCING CENTRES HERE]

    Standards centres

    [XXXX INSERT DETAILS OF STANDARDS CENTRES HERE]

    What virtual centres (for example consortia and communities) for bioinformatics activity are there?

    [XXXX INSERT MORE DETAILS OF VIRTUAL BIOINFORMATICS CENTRES HERE]

    Online Resources: What bioinformatics Websites are there?

     

     

    Tutorials

    A great place to start, whether you come from a biological, physical or computational background is at Martin Vingron's superb online bioinformatics tutorial. (Begin by choosing a section from the left-hand-side menu bar.)

    Tom Smith and Don Emmeluth have produced a nice little exploration of bioinformatics using NCBI resources and tools.

    I recently stumbled upon a promising set of online lecture notes currently under construction by B. Steipe at the Genzentrum (Gene Center) at the Ludwig-Maximilians-Universität München (University of Munich).

    Chemistry for all

    A defiantly frames-free chemistry tutorial site.

    Mathematics for biologists

    First of all, an almost completely painless introduction to the horrors of the quadratic equation by Peter Whalen, James Walker, and Drew Marticorena.

    C. J. Schwarz of the Department of Statistics and Actuarial Science, Simon Fraser University has produced a course in statistics which is accompanied by a set of sound, online PDF handouts.

    Here is a great guide to a whole array of statistical learning/teaching resources prepared by Juha Puranen of the University of Helsinki (English).

    Computers for biologists

    Programming for biologists

    General introduction to biology for computer scientists

    Estrella Mountain Community College in the States offers this excellent short introduction to biology (actually "The Nature of Science and Biology"). It's a great place for keyboard jockeys to start their journey to enlightenment. Thanks to Alex O'Neill for pointing out the broken link.

    Genetics

    The Dolan DNA Learning Center at Cold Spring Harbor has an outstanding interactive tutorial introducing genetics. To take full advantage of the multimedia elements you should download the Flash and Real players.

    Molecular biology for computer scientists

    The Institute of Arable Crop Research Beginner's Guide to Molecular Biology

    Protein chemistry for computer scientists

    Unilever Education Advanced Series tutorial on proteins.

    Cell biology for computer scientists

    The University of Arizona has made available a high-quality tutorial in cell biology. Not only does it cover the facts, but it also attempts to introduce some of the philosophy of the field---recommended. Even better, it's also available en Español and in Italiano.

    Once you've worked your way through that you might like to see some scanning electron microscope images of some of the structures you've read about taken by members of John Heuser's lab.

    Evolution for computer scientists

    Bob Patterson maintains his "Darwiniana" with amazing diligence.

    Practical bioinformatics

    Other lists of bioinformatics tutorials

    Education: Where can I study Bioinformatics...

    This section is not complete, but contributions to broaden its coverage are welcome. Please do not direct questions about eligibility, course quality or admissions policy to me, but ask the individual institutions directly. Use the links to obtain contact details. If an institution doesn't provide telephone numbers/email addresses or snailmail details on its Web site it doesn't deserve your patronage.

    This resource focuses on complete, full-time degree programmes rather than on individual study modules. Curating a list of the latter would be a full-time job. You can go to other places, however, if you are looking for short courses. Thanks to various contributors, including Wentian Li who pointed me to this list at Rockefeller which is mirrored at various other sites. And to Humberto Ortiz Zuazaga for mailing me a link to the ICSB, where you can find this list.

    If you are interested in U.S. programmes, here's a list from Curtin and here's a list from Stanford. Thanks to Amelie Stein who also supplied some of the individual entries in this section.

    Those wanting to find programmes in the Asia Pacific region could have a look at this resource maintained by the Asia Pacific Bioinformatics Network APBioNet. Thanks to Sentausa.

    In the UK The Bioinformatics Resource (part of the BBSRC's CCP11 project) maintains (among many other resources) lists of (mainly) British Masters and PhDs in bioinformatics. If you have any suggestions or updates please contact me with them. You can publicize your course and offer a public service at the same time.

    Africa

    Rhodes University, Grahamstown, South Africa offers an MSc. in Bioinformatics and Computational Molecular Biology. Thanks to Natalie Twine.

    Cathal Seoighe wrote a while back about the South African National Bioinformatics Institute (SANBI). Ruediger Braeuning has since written to point out that bioinformatics training in South Africa has been radically reorganized. He says:

    "A new institute, the National Bioinformatics Network (NBN), has been created. We have nodes at Universities all over the country (UWC, UCT, SUN, RU, UKZN, UP, WITS). Our main tasks are to:

    • develop capacity in Bioinformatics
    • perform world-class research
    • support local Biotechnology initiatives

    "We do offer courses on various topics in Bioinformatics ranging in length from 3 days to several weeks. We also train Bioinformaticists on MSc, PhD and post doc level. Undergraduate programs are currently being developed. Bursaries are available. For more information visit our Website."

    South African National Bioinformatics Institute (SANBI) Honours Bioinformatics Course at the University of the Western Cape. Next year the same institute will be offering a Master's in bioinformatics---thanks to Cathal Seoighe.

    If you know of any other bioinformatics courses on the African continent please feel free to mail me about them.

    The Americas

    Brazil

    According to Pablo Nehab-Hess the Laboratório Nacional de Computação Científica (LNCC), Brazil and the Universidade Federal do Rio de Janeiro (UFRJ) recently created a joint Bioinformatics MSc programme, through the Genetics Department of UFRJ and the Department of Applied Computational Mathematics of LNCC.

    Canada

    Thanks to Jordan Patterson for the information that the University of Alberta offers four-year Biology or Computer Science degrees with a specialization in bioinformatics. The Faculty of Computer Science there offers Master's and PhD training in bioinformatics.

    Benjamin Horsman wrote to tell me that Simon Fraser University and the University of British Columbia are collaborating on a new Bioinformatics training program with the British Columbia Cancer Agency. The program offers post-graduate diploma, Master's, and PhD training in Bioinformatics. Now Simon Fraser University also offers a joint major programme in Molecular Biology and Biochemistry (MBB) and Computer Science in Bioinformatics. Thanks to Brittany Nielsen for the info.

    Thanks to Olga Likhodi for the information that Seneca College, Toronto offers a post-graduate diploma in Bioinformatics.

    Peter Kublik informs me that from 2003/2004 the University of Calgary will offer a bioinformatics programme. He's part of the first intake.

    The University of Waterloo, Department of Computer Science offers undergraduate and graduate courses in bioinformatics. More information is here.

    California

    The Keck Graduate Institute claims that computational biology is a core element of the curriculum in its Master of Bioscience degree.

    Stanford University offers academic and professional (distance-learning) MSs in Biomedical Bioinformatics as well as its PhD programme. Thanks to Betty Cheng.

    Thanks to Momchil Georgiev for the information that the University of California at San Diego offers a Bioinformatics graduate programme and to Dana Brehm for the news that there is now a new bachelor's program. To quote her:

    "[This is an] undergraduate, interdisciplinary program for undergraduates leading to a B.S. degree. The new Bioinformatics major is offered by the Division of Biology, and the departments of Chemistry/Biochemistry, Computer Science and Engineering, and Bioengineering. A student may choose to major in Bioinformatics in any one of the four departments or division. The Division of Biology currently offers two Bioinformatics courses, and with the advent of the cross-disciplinary major, even more courses are going to be taught 2002-03 and 2003-04."

    University of California, Irvine Informatics in Biology and Medicine

    David Delong wrote to me to point out that the College of Natural and Agricultural Sciences at the University of California, Riverside is developing a "Center in Genomics and Bioinformatics" which will offer a PhD curriculum in genomics and bioinformatics from academic year 2001-2002 onwards.

    Catherine Velazquez says that The University of California, Santa Cruz offers a new undergraduate BS course in bioinformatics. They have a Frequently Asked Questions. Now they also offer an MS/PhD in Bioinformatics. Thanks to Kevin Karplus for the update.

    Connecticut

    Javier Rojas Balderrama emailed me to point out that Yale University offers a Bioinformatics and Computational Biology track as part of its combined Biological and Biomedical Sciences graduate programme.

    Georgia

    Georgia Institute of Technology Masters of Science in Bioinformatics

    According to Eric VanWieren Georgia State University offers a Master's and PhD in Computer Science with a focus on bioinformatics. The university's Bachelor of Science in Computer Science also offers a "Fundamentals of Bioinformatics" course.

    Illinois

    The University of Illinois at Chicago offers graduate programmes covering Bioengineering Bioinformatics through its Bioengineering department as well as an undergraduate course track. Thanks to Amit Sabnis.

    Indiana

    IUPUI offers an MS programme in Bioinformatics.

    Indiana University also offers an MS programme in Bioinformatics.

    Iowa

    Iowa State University offers an Interdisciplinary Ph.D. Program in Bioinformatics and Computational Biology (BCB).

    Maine

    The Jackson Lab, a world centre of mouse genome informatics, offers a graduate training program.

    Maryland

    Tim Young wrote to say that Johns Hopkins University in Maryland offers an MS in Bioinformatics through the Zanvyl Krieger School of Arts and Sciences' Advanced Academic Programs and the Whiting School of Engineering's Engineering and Applied Science Programs for Professionals. They are also offering a Bioinformatics concentration with their MS in Biotechnology program.

    Massachusetts

    Boston University offers a graduate programme and so does its partner Northeastern University. Northeastern also offers a Graduate Certificate in the subject.

    Brandeis University offers both a Master of Science in Bioinformatics and a Graduate Certificate in Bioinformatics. Thanks to Matt Foster.

    The Department of Computer Science at UMass Lowell offers various degrees from Bachelor's through to PhD level in Computer Science with Bioinformatics options.

    Mexico

    At the National Autonomous University of Mexico a doctoral program in biomedical sciences is available. Their Computational Molecular Biology Group is here.

    Minnesota

    The University of Minnesota offers a graduate programme in bioinformatics. Thanks to Lynda Ellis for the up-to-date link.

    Thanks to Anu Haniharan for drawing my attention to mixing up the Minnesota and New Jersey paragraphs.

    Nebraska

    The University of Nebraska Lincoln offers an Interdisciplinary Bioinformatics Specialization.

    The Graduate Program of the Pathology-Microbiology Department at the University of Nebraska Medical Center (University of Nebraska at Omaha) offers a specialty track in bioinformatics.

    New Jersey

    Rama Penta wrote to say that Stevens Institute of Technology offers a Master's programme in Bioinformatics.

    The message also states that the University of Medicine and Dentistry New Jersey (UMDNJ) offers a programme in biomedical informatics.

    Moustafa wrote to say that Ramapo College in New Jersey is the only school in New Jersey offering a Bachelor's degree in bioinformatics.

    New York State

    The University at Buffalo has been involved in establishing a "Center of Excellence in Bioinformatics". It used to offer a range of courses in bioinformatics and related subjects, but all the course links seem to be dead now. Thanks to Jeff Ligas for the original notification.

    Canisius College---also in Buffalo, NY---has had a state-approved B.S. in Bioinformatics since 2001. Thanks to Deb Burhans.

    Cornell and Rockefeller Universities, together with the Sloan-Kettering Research Institute offer a "Tri-institutional program in Computational Biology and Medicine". Thanks to Brant Inman.

    Since September 2003 Farmingdale State University of New York has offered a unique baccalaureate Bioscience curriculum including bioinformatics as one of its concentrations. Thanks to Charles Adair for this information.

    Polytechnic University in Brooklyn offers a graduate programme in bioinformatics. Thanks to Bulat K.

    Rensselaer Polytechnic Institute offers both undergraduate and graduate programmes in bioinformatics.

    Rochester Institute of Technology offers BS, MS and BS/MS programmes in Bioinformatics. Thanks to Brandon H.

    According to Maureen Downey, the College of Staten Island, part of the City University of New York, also offers a challenging program in bioinformatics.

    If you know of any other bioinformatics courses on the American continent please feel free to mail me about them.

    North Carolina

    Duke University's Center for Bioinformatics and Computational Biology offers various bioinformatics programmes.

    The North Carolina State University Statistical Genetics and Bioinformatics Program offers Master's degrees and PhDs in bioinformatics.

    The University of North Carolina at Chapel Hill offers a programme in Bioinformatics and Computational Biology (BCB).

    Ohio

    Andrew Johnson writes: "There is a relatively new Biomedical Informatics program in Ohio. (I'm entering the program in a few months). Though the department stands alone, it is in the College of Medicine at the Ohio State Medical Center. Entrance is offered through a new Integrated Biomedical Sciences Graduate Program."

    Pennsylvania

    The University of Pennsylvania offers some of the best known and longest established bioinformatics programmes at Bachelor's, Master's and PhD levels. Thanks to Louis Licamele for pointing out my oversight (I just assumed I'd already listed them!). He also points out that Georgetown University is planning bioinformatics courses too.

    Texas

    Tom Andrews, a student on the course, has written to me to tell me that Texas A&M University at Corpus Christi is currently offering a BS computer science degree in bioinformatics.

    Jeremy Read told me that St. Edward's University in Austin offers a B.S. in Bioinformatics.

    The Keck Center for Computational Biology---a joint venture of Baylor College of Medicine; University of Houston; Rice University; University of Texas Health Science Center, Houston; M.D. Anderson Cancer Center; and University of Texas Medical Branch, Galveston---offers undergraduate (not 2003) and graduate level training in Computational Biology.

    The University of Texas, El Paso offers a Master's in Bioinformatics.

    Virginia

    George Mason University offers both M.S. and PhD. programmes in Bioinformatics.

    The Virginia Polytechnic Institute and State University's Bioinformatics Institute offers graduate options in Bioinformatics. Thanks to William S. Preissner for correcting this entry.

    Asia

    Hong Kong

    Raymond Lau drew my attention to the Bachelor of Science degree in bioinformatics at the University of Hong Kong.

    India

    Niranjan Swaroop Sharma wrote to tell me about the Bioinformatics Institute of India which is offering a whole range of bioinformatics programmes and qualifications in both regular and distance learning formats. I would have reported on this earlier, but have not been able to view the site in Mozilla. I finally viewed the site using Konqueror today (24Jul03). Perhaps some tinkering with the ASP code is needed there...

    Vaibhav Sinha wrote to tell me that the Institute of Bioinformatics and Applied Biotechnology (IBAB) in Bangalore is offering bioinformatics courses.

    Thanks to Surjeet Singh for drawing my attention to the Indian Institute of Information Technology-Allahabad which runs a Master of Technology (M. Tech Bioinformatics) degree.

    According to Rahul Agrawal, the Indian Institute of Technology Delhi, New Delhi provides courses in Biochemical Engineering and Biotechnology. He adds that another branch of the Institute, IIT Kharagpur also provides various courses in this area.

    There is an Advanced (Graduate) Diploma in Bioinformatics in the Bioinformatics Centre at the Jawaharlal Nehru University.

    Madurai Kamaraj University in Madurai, India claims to have been the first in the country to initiate a bioinformatics programme and advanced diploma in bioinformatics at its School of Biotechnology.

    Risabh Bhandari writes to say:

    "The recently rechristened CBT (Center for Biochemical Technology) [link dead 13Nov02] which is a CSIR Lab [in New] Delhi has started a PG Diploma in Bioinformatics in association with Informatics institute. The course covers a large area in the field with [its] primary focus on computational and programming concepts. The course is 6 months in duration, [and] conducted at the national Head office of [the] Informatics institute."

    The University of Pune, Maharashtra offers its MSc. in Bioinformatics and Advanced Diploma in Bioinformatics at the Bioinformatics Centre, India.

    Uma Paresmeswaran wrote to say that SASTRA, which is based near Trichy, Tamil Nadu, will be offering a B.Tech. Programme in Bioinformatics from 2003/2004, possibly making it the first institute in India to offer this course at the undergraduate level.

    There is, according to Aditi Arur, an MSc distance education program in Bioinformatics, offered by Sikkim Manipal University India.

    Sugandha Singhal wrote to mention the undergraduate and graduate programmes in bioinformatics at Vellore Institute of Technology in Tamil Nadu and the undergraduate programme in bioinformatics at Amity Institute, NOIDA.

    Malaysia

    Dr Amir Feisal Merican wrote to say that the Institute of Biological Sciences, Faculty of Sciences, University of Malaya, Kuala Lumpur, is offering a BSc (Bioinformatics) undergraduate degree programme. Yam confirmed that this degree has been taught for 3 years.

    Alfred Simbun suggested three more Malaysian universities offering bioinformatics degrees: Universiti Industri Selangor (UNISEL), Kolej Universiti Teknologi & Pengurusan Malaysia (KUTPM) and Universiti Kebangsaan Malaysia (UKM).

    Kebangsaan University, Malaysia (UKM) will start to offer a Bachelor's Degree in Bioinformatics to its next intake, in July, 2003.

    Pakistan

    Thanks to Abdul Hameed for pointing out that two universities in Pakistan---COMSATS Institute of Technology and the Mohammad Ali Jinnah University---will offer four-year Bachelor of Science degrees in bioinformatics from September 2003.

    Singapore

    The Bioinformatics Centre of the National University of Singapore offers Undergraduate and PhD programmes in conjunction with the life sciences departments and research institutions at NUS.

    Lam Ah Wah wrote to tell me that the Nanyang Technological University (NTU) starts a BioInformatics undergraduate and part-time post-graduate MSc course in Jul 2002. Be warned: their Web site has a hideous frame/window-based "portal" which breaks half a dozen rules of good interface design. Chua Hian Koon managed to find a better link, and I browsed from there to the syllabus here.

    If you know of any other bioinformatics courses in Asia please feel free to mail me about them.

    Australasia

    Australia

    The Research School of Biological Sciences, at the Australian National University in Canberra offers PhD., MSc. and Honours programs in Bioinformatics.

    You can obtain a Graduate Certificate in Bioinformatics from Curtin University of Technology in Western Australia.

    As of 2001 Flinders University in Adelaide offers a Bachelor's of Science in Bioinformatics.

    The Biochemistry Department of La Trobe University in Victoria also offers an undergraduate course in Bioinformatics.

    The University of Melbourne offers undergraduate study in Bioinformatics. Thanks to Gad.

    There are (according to H L View) PhD, MPhil and Honours programmes in bioinformatics (plus a bioinformatics minor) available at Murdoch University's Centre for Bioinformatics and Biological Computing.

    Rachel Oh said that it is possible to study a near-bioinformatics programme at QUT (Queensland University of Technology): the B. Sci (biotech maj.) & IT (in software engineering & data comms) IF29. A copy of the course is available by searching their Website.

    The University of New South Wales in Sydney offers a Bachelor of Engineering in Bioinformatics.

    According to Jonathan Watts, "Queensland University of Technology in Brisbane QLD offers a Bachelor of Applied Science Innovation, with a major in Bioinformatics" from 2004.

    Sydney University in New South Wales offers a Bachelor's of Science and a postgraduate Master of Applied Science degree in Bioinformatics. Thanks to Dominic Lau and Sebastien Gerega for the update.

    If you know of any other bioinformatics courses in Australasia please feel free to mail me about them.

    New Zealand

    Thanks to Danushka for the information that the University of Auckland, New Zealand has a BSc (Hons) in bioinformatics.

    Europe

    Austria

    A bioinformatics option is offered as part of degree courses at the Graz University of Technology (Technische Universität Graz) in Graz, Austria.

    Belgium

    A consortium including nearly all the French-speaking universities of Belgium (Bruxelles, Liège, Louvain, Mons, Namur and Gembloux) is offering the "Inter-University DEA/DES (Master) in Bioinformatics".

    The Department of Engineering at the Katholieke Universiteit Leuven offers a Master of Bioinformatics degree.

    Denmark

    The Bioinformatics Centre at The University of Copenhagen offers a two-year masters program in bioinformatics. Thanks to Thomas Litman.

    The Technical University of Denmark, Center for Biological Sequence Analysis offers a two-year International MSc. in bioinformatics.

    Syddansk Universitet (The University of Southern Denmark) offers both BSc- and MSc- level Bioinformatik / Experimental Bioinformatics. Thanks to Fiona Nielsen for the updated link---"Center for Experimental Bioinformatics".

    Finland

    The Finnish Graduate School in Computational Biology, Bioinformatics, and Biometry or "ComBi" is a joint venture of the University of Helsinki (English), the University of Turku (English) and the University of Tampere (English).

    France

    Fabio Pardi writes that the Université Paris VII offers a DEA en Analyse de Génomes et Modélisation Moléculaire. Thanks to Brant Inman again for this link to the course.

    Isabelle da Piedade kindly provided this list of Master's and PhD programmes in France:

    Germany

    Thanks to Amelie Stein for several of these entries.

    The Technische Fachhochschule Berlin (University of Applied Science) offers an MSc in Bioinformatics and the Freie Universität Berlin (Free University) offers both an MSc. and a BSc. in Bioinformatics. Thank you to Sebastian Kurscheid for this information.

    Alexandra Reitelmann wrote to say that Bonn-Aachen International Center for Information Technology (B-IT) is offering a new English-language Master's programme in Life Science Informatics. The B-IT is a joint venture between the University of Bonn, the RWTH Aachen University, the University of Applied Sciences Bonn Rhein-Sieg, and Fraunhofer Institutszentrum Birlinghoven Castle (IZB).

    The Institut für Informatik at Johann Wolfgang Goethe-Universität Frankfurt am Main offers a programme in Bioinformatik.

    The Fachhochschule Bingen also offers a bioinformatics degree. Thanks to Manuel Schmidt.

    Bioinformatics can be studied at the Fachhochschule (University of Applied Sciences) Oldenburg/ Ostfriesland/Wilhelmshaven. Thanks to Gerd Klaassen.

    Bioinformatics is taught at Friedrich-Schiller-Universität, Jena. Thanks to Lisa Mullan for the updated link.

    The Interdisziplinäres Zentrum für Bioinformatik at the Universität Leipzig teaches Bioinformatik.

    You can do a PhD in bioinformatics in the Department of Computational Molecular Biology at the Max Planck Institute for Molecular Genetics. Thanks to Martin Okrslar---and to Pooja Jain for the correction to my broken link.

    The Technische Universität München and Ludwig-Maximilians-Universität München also offer Bioinformatik.

    The Universität Tübingen (University of Tübingen) also offers Bioinformatik. Here are their own Frequently Asked Questions (in German only) about studying bioinformatics there.

    Tobias Kailich kindly pointed out that FH Weihenstephan in Freising (near Munich) offers opportunities to study Bioinformatik / Bioinformatics.

    Ireland

    Conor Meehan wrote to say that the National University of Ireland Maynooth set up a four-year Bachelor's course in Computational Biology and Bioinformatics two years ago.

    Israel

    Ben Gurion University, Beer Sheva offers places on the Bioinformatics Track to a select few of its admitted students to the School of Computer Science.

    Tel Aviv University offers a BSc. in Bioinformatics. Thanks to Racheli Zakarin for the link.

    The famous Weizmann Institute in Rehovot teaches an MSc. called "Multidisciplinary Program in Computational Biology and Bioinformatics". This PDF document has more information. Gad Abraham, who told me about this, points out that "all studies there are conducted in English and that there are no tuition fees".

    The Netherlands (Holland)

    The Centre for Molecular and Biomolecular Informatics (CMBI) at the University of Nijmegen offers a Master's degree in bioinformatics. This is a one or two year course leading to a degree with the formal title of "Master in Life Sciences", but the subtitle "Bioinformatics".

    Wageningen University offers MSc courses in Bioinformatics. Thank you to Judith Risse.

    Norway

    The Institutt for informatikk (Department of Informatics) of the University of Bergen, Norway offers a Master's degree in bioinformatics.

    Portugal

    There is a post-graduate programme in bioinformatics organized by the Instituto Gulbenkian de Ciência (IGC) and the Faculty of Sciences of the University of Lisboa. (Thanks to Pedro Fernandes.)

    Francisco Rocha wrote to say that Escola Superior de Biotecnologia (ESB) teaches a bioinformatics programme [follow the link labelled "Bioinformática"] in both Lisbon and Oporto. The teaching institution is the Universidade Católica do Porto.

    Sweden

    Bjorn Olsson writes that, as well as a 4-year Master's Degree in Bioinformatics, the University of Skövde offers a number of short courses and allows computer science master's students to include bioinformatics in their degree. There is more information here.

    Daniel Nilsson drew my attention to the MSc in Bioinformatics Engineering in Uppsala. Thanks to Erik Kanders for correcting the link.

    There are also opportunities to study bioinformatics on the "normal" biotech courses in Gothenburg, Linköping and Umeå.

    The Stockholm Bioinformatics Centre, Stockholm University, offers PhD-level shorter courses in bioinformatics subjects.

    The School of Mathematical and Computing Sciences at Chalmers offers undergraduate and Master's programmes in bioinformatics. Thanks to Samuel Hargestam.

    Switzerland

    Fabio Pardi wrote that the Swiss Institute of Bioinformatics offered a Master's degree (DEA). It was a collaboration between the Swiss Institute of Bioinformatics and three faculties of the Universities of Geneva and Lausanne. According to Javier Rojas Balderrama this programme is now closed.

    United Kingdom

    In 2002 I prepared a review of bioinformatics education in the UK for the journal Briefings in Bioinformatics. The article ends with a detailed listing of all current and some future undergraduate and graduate courses in bioinformatics the UK as of September 2002, along with links. You can read a preprint here.

    Bioinformatics is among the specialisms available on Aberdeen University's MSc/PgDip Information Technology.

    The University of Abertay, Dundee has an MSc./PG Dip in Bioinformatics. Thanks to Dr Nagesh.

    Birkbeck College is a British centre with a proud tradition in educating working and/or mature students to the highest academic standards.

    The University of Birmingham and UMIST offer undergraduate courses in bioinformatics.

    Cambridge University is planning an MPhil in Computational Molecular Biology to start in 2004-2005. Thanks to Antony Quinn for the reminder.

    In October 2004, Cardiff University started two different courses: Bioinformatics or Genetic Epidemiology and Bioinformatics either full-time or part-time and at MSc/PG Cert or Diploma level. Thanks to Ian Brewis, who pointed out that Cardiff's programme is distinguished by offering students a stronger thread of genetic epidemiology for those students interested in this.

    In April 2002 City University's Bioinformatics group moved to the University of Glasgow Department of Computer Science. Thanks to Will Bachelor for alerting me to the existence of this group. City still offers MScs in Pharmaceutical Information Management and Health Informatics.

    Cranfield University at Silsoe offers an MSc. in Bioinformatics.

    Hussein Zedan pointed out that De Montfort University, Leicester was going to start its MSc. in Bioinformatics in September 2003 in both full- and part-time formats.

    The University of East Anglia offers an MSc. in Bioinformatics. Thanks to Dr Nagesh.

    Edinburgh University offers an MSc./Diploma in Quantitative Genetics and Genome Analysis and an MRes (MSc./Diploma by Research) in Life Sciences in which you can specialize in Quantitative trait analysis and genomics.

    The University of Exeter offers various graduate programmes: MSc/MRes/PgCert/PgDip in Bioinformatics. (Thanks to M Antro for an update.)

    The University of Glasgow offers an MRes in Bioinformatics.

    In November 2004, Fiona Croll alerted me to Heriot-Watt University's Bioinformatics (IT) MSc jointly taught by the university's School of Mathematical and Computer Sciences and its School of Life Sciences.

    Imperial College offers a new MSc in Computational Genetics and Bioinformatics and MRes Biomolecular Sciences courses.

    There are MRes studentships available on the courses at Leeds University.

    On 20Jan03 UKeU, the UK government-backed company set up to provide online degrees from UK universities to students worldwide, announced a new Master's level programme in Bioinformatics from the Universities of Leeds and Manchester. (Thanks again to Jo Wixon for this.)

    The University of Liverpool offers an M.Sc., Postgraduate Diploma and Postgraduate Certificate in Biosystems & Informatics.

    Manchester University also teaches bioinformatics to its undergraduates as well as offering a taught MSc. course in the subject.

    Newcastle University's MRes in Bioinformatics began in September 2003.

    The University of Nottingham's undergraduate biochemistry degrees feature bioinformatics prominently.

    Oxford University has a Master's degree course with an interesting flexible structure. Thank you to Helen Parkinson and Clare Hayes for this information.

    Thank you to David Parkinson (no relation to Helen, above) for pointing out to me that for the past two years Sheffield Hallam University has offered an MSc/PGDip in Bioinformatics at its Graduate School in Science, Engineering and Technology.

    The University of Sheffield Centre for Bioinformatics and Computational Biology offers taught courses related to bioinformatics.

    Rafiu Fakunle emailed to tell me that Queen Mary, University of London offers an undergraduate degree in bioinformatics.

    Royal Holloway College in the University of London offers an MSc. in Computer Science by Research in which a bioinformatics specialism is available.

    University College London (UCL) offers a final year undergraduate course: "Bioinformatics:Genes, Proteins and Computers".

    Together with Harrow School of Computer Science, The University of Westminster, a new university in London, offers an MSc. in Bioinformatics as both a full- and part-time course. Again this is aimed primarily at graduates of the biological sciences.

    York University's Department of Biology offers Masters courses and PhDs in both computational biology and biomolecular science.

    If you know of any other bioinformatics courses in Europe please feel free to mail me about them.

    ...Remotely (Distance/Correspondence Courses)

    Many visitors to the FAQ ask about bioinformatics distance learning. Eventually I will try to gather together all those courses on this list that can be taken remotely---if I ever have the time. Unfortunately I don't at the moment. All I can suggest is that you examine the courses yourself through the links provided in the FAQ. Many can be taken over the Net or offer components that can be studied at a distance. (And, if you do compile such a list for yourself, do please email it to me and I will post it here for the benefit of our users with, as usual, a full credit for your efforts.)

    If you are thinking of studying at a UK institution you might want to search through the pre-print of my review of UK bioinformatics education for the word "distance". At the moment I think the courses at Birkbeck, Exeter and Oxford offer either full or part distance learning options.

    Careers: How can I become a bioinformatician?

    How can I get involved?

    If you want to get involved in bioinformatics, now is an exciting time, but (certainly for less senior practitioners) it looks as though demand for bioinformaticians is currently falling, partly for general economic reasons, partly, perhaps, because drugs companies in particular have been disappointed with the pay-off from their investment in the field.

    This section is opinionated; there are people in the field, both computer scientists and biologists, who I would love to provoke (or convert). If you are a newcomer, and especially if you come from one of bioinformatics' component pure disciplines, I hope my ranted warnings will help you to avoid the mistakes of your predecessors---and I write as one of the mistaken. David S. Roos put it well in his review in the journal Science:

    "Lack of familiarity with the intellectual questions that motivate each side can also lead to misunderstandings. For example, writing a computer program that assembles overlapping expressed sequence tags (EST) sequences may be of great importance to the biologist without breaking any new ground in computer science. Similarly, proving that it is impossible to determine a globally optimal phylogenetic tree under certain conditions may constitute a significant finding in computer science, while being of little practical use to the biologist."

    How can I get involved?---I am a "newbie"

    Please read the education section above for information about some of the places you can currently study bioinformatics. Please do not direct questions about eligibility, course quality or admissions policy to me; ask the individual institutions directly.

    If you are a high school student / sixth former, think about taking an interdisciplinary computational biology or bioinformatics bachelor's degree of the sort offered at, for example, Manchester University in the UK or UPenn in the States. Don't worry if you can't find a place on such a course or there isn't one nearby; perhaps the best way to approach this subject is from two sides. Do a bachelor's degree in one area while taking a healthy interest in the other---or (if you can afford to) complement a first degree in one part of the discipline with a second degree in the second.

    If you already have a degree in a biological discipline there are similar Master's courses---both interdisciplinary (e.g. Birkbeck's in London) and conversion type courses---for biologists or others to learn computer science, for example.

    If you are currently doing a computer science or biology PhD, try to take advantage of the opportunity to take courses in the "other" discipline.

    How can I get involved?---I am a biologist

    To a biologist I would say: take as many real computing courses as you can. It's important not just to learn a programming language, but also to learn the discipline of computing; to structure and document your work in a rigorous way. What courses you take might be directed by the kind of work you are interested in doing when you graduate---whether you see yourself supporting bioinformatics applications or building them. For the former you need all-round familiarity with the programs themselves and the hardware and software needed to run them---plus your existing understanding of biology. For the latter you need to learn a structured programming language and the principles of good program design---plus the ability to talk to and understand biologists.

    Courses biologists might consider taking:

    UNIX

    Of all the computing courses available it is most important that you have a proper introduction to the UNIX operating system(s). Most current bioinformatics software (especially the free stuff) runs on "open" platforms like Linux and the Web. The UNIX philosophy is elegant, powerful, and frustrating. Master it and you will save a lot of time.

    Mathematics

    Learn some maths. Basic statistics, logic/set theory and a little calculus would be my recommendation. Many practising biologists have little or no grasp of elementary concepts like statistical significance, permutations and combinations and the principles of good experimental design. Logic will come in handy at the very least if you want to query databases in an intelligent way.
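
    To make the counting ideas above concrete, here is a minimal sketch in Python (an arbitrary choice of language; it assumes Python 3.8 or later for math.comb and math.perm). The function names and the 20-letter amino acid alphabet are illustrative assumptions, not anything prescribed by this FAQ.

```python
import math

AMINO_ACIDS = 20  # assumed size of the standard amino acid alphabet


def count_peptides(length: int) -> int:
    """Distinct peptides of a given length: order matters, repeats allowed."""
    return AMINO_ACIDS ** length


def ways_to_choose(n_items: int, k: int) -> int:
    """Combinations: ways to pick k items from n when order does not matter."""
    return math.comb(n_items, k)


def ways_to_arrange(n_items: int, k: int) -> int:
    """Permutations: ways to pick k items from n when order does matter."""
    return math.perm(n_items, k)


if __name__ == "__main__":
    print(count_peptides(3))      # all possible tripeptides
    print(ways_to_choose(8, 3))   # e.g. choosing 3 samples out of 8
    print(ways_to_arrange(8, 3))  # the same choice, but ordered
```

    Even this toy example shows why exhaustive approaches fail quickly: the number of possible peptides grows exponentially with length.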

    Programming

    If you're interested in development, learn a real programming language: Pascal, C(++), Java or Fortran.

    Perl and HTML are the stuff that holds the Web together. A grasp of these is essential for a lot of the Web/database work being done by many bioinformaticians at the moment.

    Good old BASIC can be very useful as an introduction to programming or as a tool in its own right, but none of these latter languages is built to crunch numbers and tackle real world biological problems---which isn't to say people don't try...

    How can I get involved?---I am a computational/quantitative scientist

    One thing that I will emphasise repeatedly in this section is the simple value of doing some "proper" biological laboratory science. I have sat through many talks during which a bioinformatics "scientist" describes in great detail how his---it's usually "his"---application of a trendy mathematical tool offers a supposed insight into a (sometimes supposed) biological problem. Nine times out of ten I know that this method will never be so much as sneezed on by a practising biologist.

    Quantitative scientists sometimes talk about their interest in studying some aspect of "God's mind". Biologists, in contrast, are interested in "Mother Nature". You might meditate on God in the hope of some revelation, but to understand Nature you have to meet her in the flesh. You are as likely to be useful to biologists working in isolation at the keyboard as you are to conceive with your clothes on. Desk-bound bioinformaticians have written code that has turned out to be popular with biologists, but almost always because they have collaborated with biologists.

    Courses quantitative scientists might consider taking:

    Molecular biology

    "MoBi" was the bioinformatics of its day; desperately fashionable, the province of new, higher-paid practitioners and considered with slight suspicion by more traditional biologists. It was once a great achievement to sequence a modest stretch of DNA, now it's a job for robots. Today the technology of molecular biology is very well established. Scientists can buy kits to perform the sort of genetic manipulations that would make your parents' jaws drop. Some of the kits are so simple your small children could use them (with a modest amount of training and supervision).

    Despite the profusion of commercial kits, there is still a requirement for real skill in molecular biology and the general level of scientific understanding required to be a good biological scientist---rather than just completing a practical class---doesn't come easy. Living matter, the stuff you have to work with, is unpredictable and responds slowly---except when it's dying. Even supposedly fast-growing bacteria can take a long time to yield up their secrets.

    Now, fashions in biomedical research are shifting from molecular biology back to cell biology and protein biochemistry, but it's well worth offering yourself up as a volunteer for some vacation work in a molecular biology lab. The term is now more often used to refer to the technological tools provided by MoBi to biology in general, rather than to fundamental research in the field itself. Those tools are common to a vast array of different kinds of research, from archaeology to zoology.

    Protein (bio)chemistry

    Protein (bio)chemistry is experiencing a revival. Proteins are still more delicate and fussy than nucleic acids. The same advice that applies to molecular biology applies to protein biochemistry. That stuff bioinformatics people refer to as "wet lab science" is much harder than it looks.

    You might find it more difficult to get access to a good protein lab than a good molecular biology lab and do protein science with real wizards, but the very least you can do is read about the theoretical aspects of the subject.

    For insights into the principles of protein structure, try, for example, Carl Branden and John Tooze's "Introduction to Protein Structure" [Garland ISBN 0-8153-2305-0]. Physicists in particular might find the lack of general unifying principles in this area overwhelming. Unfortunately there's no substitute for acquiring a "feel" for the subject by examining a lot of examples. Still the most critical stages in the successful prediction of protein structure from sequence are those requiring human intervention.

    Thomas E. Creighton has been responsible for a range of standard texts on protein chemistry. If you are working in a protein lab you are likely to come across his "Protein Function : A Practical Approach" [ISBN 019963615X] and the rather more expensive and theoretical "Proteins : Structures and Molecular Properties" [ISBN 071677030X].

    Evolutionary biology

    It's a worn quote, but worth repeating:

    "The mechanisms that bring evolution about certainly need study and clarification. There are no alternatives to evolution as history that can withstand critical examination. Yet we are constantly learning new and important facts about evolutionary mechanisms. Nothing in biology makes sense except in the light of evolution."

    Theodosius Dobzhansky in "American Biology Teacher" vol.35

    Darwin's theory is one of the simplest and most misunderstood in science. Start with a good layperson's introduction: Richard Dawkins' "The Selfish Gene" (and remember: it's a metaphor, stupid) or "Almost Like a Whale", Steve Jones' paraphrasing of Darwin's original "The Origin of Species". All biologists agree on the underlying principles, but they are nearly ready to kill one another over the details. After reading a decent book on evolutionary biology you should have at least a handful of good questions. Now you are ready to take a class in the subject. Take your questions with you. You'll probably start an argument---or a fight.

    You might also like to peruse Cynthia Gibas's answers to similar questions from computational scientists on the O'Reilly Web site.

    These damned biologists are making me use Word instead of LaTeX to write up---what can I do?

    Try this.

    More general advice

    Use the software

    Get access to an installation of EMBOSS and/or Staden and get someone to lead you through the tools available. RasMol is a simple, but powerful and elegant molecular imaging program which can teach you a great deal about biological macromolecules; try a tutorial. Get out on the Web and do some productive surfing for a change :-) . The best starting point is the Human Genome Mapping Project Resource Centre's "GenomeWeb". There's so much stuff out there -- and most of it is free to academics.

    Where can I find Bioinformatics jobs?

    Start here at Bioinformatics.Org's Job Announcements Homepage...

    Then move on to the appointments / careers sections of the major scientific journals, or, better, search their Web jobs pages with "bioinformatics":

    Appropriately for a Web-dependent discipline, there are a variety of specialist commercial Web sites which carry bioinformatics jobs:

    There are also a number of companies actively recruiting in the area. Here are a few:

    Practical tips

    This section includes some simple rules-of-thumb to apply when performing common bioinformatics tasks. I try to give a reference to a more detailed source of guidance where I know of one.

    How do I find a sequence?

    The most common task in bioinformatics must be the acquisition of some bioinformatics data on which to operate. Usually this is in the form of a nucleic acid or protein sequence, stored as characters in the appropriate alphabet together with a header of related information: for example, some kind of unique identifying number, the species from which the original biological substrate was obtained, the names of any authors who published the sequence and so on.
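
    As a rough illustration of a sequence stored with a header, here is a minimal sketch that assumes the widely used FASTA text layout (a header line starting with ">" followed by one or more lines of sequence characters). The format choice is an assumption for this example, and the record contents and the function name parse_fasta are made up for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-style text into a list of (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record: the identifier, description and species are invented.
EXAMPLE = """>demo0001 hypothetical protein [Example species]
AKVAIL
VCGMD
"""

records = parse_fasta(EXAMPLE)
for header, sequence in records:
    print(header, len(sequence))
```

    Real database entries carry far richer headers than this; the sketch only shows the split between descriptive header and sequence characters.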

    You may have already generated your own sequence data experimentally. In this case you are likely to want to find sequences which are identical or similar (and therefore possibly related) to yours. The task is then one of similarity search.

    ...I have a description.

    A paradoxical problem generated by the success of the bioinformatics revolution is the increasing difficulty of navigating the huge amount of data available. Once you could print out most of the existing sequence databases onto paper and cram them into a single binder. Now a search for "actin" alone will pull out hundreds and hundreds of sequences. The key to finding what you want is to develop your own discriminatory skills rather than rely on computers to figure out what it is you're really after.

    Use Entrez-PubMed

    Make sure you are clear about your aim first. If you are looking for a sequence for a specific scientific purpose then you might do best to start with a relevant human-generated publication. For example, if you have cloned a gene which is part of a well-characterised biochemical pathway and you want to find sequences of the same functional gene product in other species (orthologues), Entrez PubMed is your friend.

    PubMed is a huge and very comprehensive database of the biomedical scientific literature, created by the U.S. National Library of Medicine (NLM). Entrez PubMed is another indispensable resource of the U.S. National Center for Biotechnology Information (NCBI). Both are part of the U.S. Department of Health and Human Services' National Institutes of Health.

    Use Swiss-Prot

    Swiss-Prot is curated by human beings.

    Use SRS at the RFCGR

    [XXXX INSERT DETAILED ADVICE HERE]

    Use Boolean logic

    [XXXX INSERT DETAILED ADVICE HERE]

    Use cunning

    [XXXX INSERT DETAILED ADVICE HERE]

    ...I have an accession number.

    [XXXX INSERT DETAILED SEQUENCE ADVICE HERE]

    ...I have another sequence.

    This section will be expanded---and there will be a more basic and detailed explanation for novice searchers, but, in the meantime, here are the top tips cribbed from the excellent paper by Hugh B. Nicholas Jr., David W Deerfield II and Alexander J. Ropelewski in BioTechniques.

    ...I'm not sure whether or not to use the defaults.

    Hugh, David and Alexander again on when not to use the default search parameters provided by a server.

    How can I align two sequences?

    This section will also be expanded for newbies; until then, here are Hugh, David and Alexander's tips for alignment:

    How can I predict the function of a gene (product)?

    [XXXX INSERT FUNCTION PREDICTION ADVICE HERE]

    How can I predict the structure of a sequence?

    You could start with any one of these excellent guides (listed strictly in alphabetical order):

    How can I simulate a biomolecule?

    Here's Peter J. Steinbach's "Introduction to Macromolecular Simulation"

    How can I write up?

    Go here to download some detailed advice. Go here for more links.

    Glossary of bioinformatics terms

    Here I attempt to define some common terms in bioinformatics. I have tried to balance clarity, brevity and rigour. Let me know if I let one of these priorities over-ride the others.

    What is an alignment?

    When two symbolic representations of DNA or protein sequences are arranged next to one another so that their most similar elements are juxtaposed they are said to be aligned. Many bioinformatics tasks depend upon successful alignments. Alignments are conventionally shown as traces.

    In a symbolic sequence each base or residue monomer in each sequence is represented by a letter. The convention is to print the single-letter codes for the constituent monomers in order in a fixed font (from the N-most to C-most end of the protein sequence in question or from 5' to 3' of a nucleic acid molecule). This is based on the assumption that the combined monomers are evenly spaced along the single dimension of the molecule's primary structure. From now on I shall refer to an alignment of two protein sequences.

    Every element in a trace is either a match or a gap. Where a residue in one of two aligned sequences is identical to its counterpart in the other the corresponding amino-acid letter codes in the two sequences are vertically aligned in the trace: a match. When a residue in one sequence seems to have been deleted since the assumed divergence of the sequence from its counterpart, its "absence" is labelled by a dash in the derived sequence. When a residue appears to have been inserted to produce a longer sequence a dash appears opposite in the unaugmented sequence. Since these dashes represent "gaps" in one or other sequence, the action of inserting such spacers is known as gapping.

    A deletion in one sequence is symmetric with an insertion in the other. When one sequence is gapped relative to another a deletion in sequence a can be seen as an insertion in sequence b. Indeed, the two types of mutation are referred to together as indels. If we imagine that at some point one of the sequences was identical to its primitive homologue, then a trace can represent the three ways divergence could occur (at that point).

    Biological interpretation of an alignment

    A trace can represent a substitution:

    AKVAIL  
    AKIAIL  

    A trace can represent a deletion:

    VCGMD  
    VCG-D  

    A trace can represent an insertion:

    GS-K  
    GSGK  

    For obvious reasons I do not represent a silent mutation.

    Traces may represent recent genetic changes which obscure older changes. Here I have only represented point mutations for simplicity. Actual mutations often insert or delete several residues.
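
    The three example traces above can be read column by column. The following is a minimal sketch, assuming the first row is treated as the reference and the second as the derived sequence, with "-" as the gap character. The function name classify_columns and the label strings are my own choices for illustration.

```python
def classify_columns(reference, derived):
    """Label each column of a two-row trace.

    A dash in the derived row is read as a deletion, a dash in the
    reference row as an insertion, following the examples above.
    """
    if len(reference) != len(derived):
        raise ValueError("aligned sequences must have the same length")
    labels = []
    for a, b in zip(reference, derived):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if b == "-":
            labels.append("deletion")
        elif a == "-":
            labels.append("insertion")
        elif a == b:
            labels.append("match")
        else:
            labels.append("substitution")
    return labels


print(classify_columns("AKVAIL", "AKIAIL"))
print(classify_columns("VCGMD", "VCG-D"))
print(classify_columns("GS-K", "GSGK"))
```

    Because a deletion in one sequence is symmetric with an insertion in the other, swapping the two arguments swaps the "insertion" and "deletion" labels.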

    What is a DNA array?

    Thanks to Bioinformatics.Org member Ravi Jain for the following answer, which I present verbatim.

    DNA microarrays consist of thousands of immobilized DNA sequences present on a miniaturized surface the size of a business card or less. Arrays are used to analyze a sample for the presence of gene variations or mutations (genotyping), or for patterns of gene expression, performing the equivalent of ca. 5 000 to 10 000 individual "test tube" experiments in approximately two days of time.

    Robotic technology is employed in the preparation of most arrays. The DNA sequences are bound to a surface such as a nylon membrane or glass slide at precisely defined locations on a grid. Using an alternate method, some arrays are produced using laser lithographic processes and are referred to as biochips or gene chips. The composition of DNA on the arrays is of two general types:

    DNA samples are prepared from the cells or tissues of interest. For genotyping analysis, the sample is genomic DNA. For expression analysis, the sample is cDNA, DNA copies of RNA. The DNA samples are tagged with a radioactive or fluorescent label and applied to the array. Single stranded DNA will bind to a complementary strand of DNA. At positions on the array where the immobilized DNA recognizes a complementary DNA in the sample, binding or hybridization occurs. The labeled sample DNA marks the exact positions on the array where binding occurs, allowing automatic detection. The output consists of a list of hybridization events, indicating the presence or the relative abundance of specific DNA sequences that are present in the sample.

    What is a homologue?

    "Homology" is a much-misused term and existed in biology long before the notion of protein sequences. Strictly, homology cannot be qualified; it is not correct to state that two proteins are "30% homologous" with each other, for example. If we could look back far enough in the evolutionary histories of any two molecules under comparison, we would be guaranteed to find a common ancestor eventually, but this is not true homology. An example of true homology would be the relationship between two variants of a single ancestral enzyme resulting from a gene duplication event.

    As a rule-of-thumb, true homology should be assigned only when the feature which leads us to suspect a relationship between molecules is one we consider likely to have derived from the molecules' common ancestor. To quote Page and Holmes [Molecular Evolution: A Phylogenetic Approach, Roderick D. M. Page and Edward C. Holmes; Blackwell Scientific; ISBN 0865428891]:

    "The classic molecular example is the parallel evolution of amino acid sequences in the lysozyme enzyme in leaf-eating langur monkeys and in cows. Both animals have independently evolved foregut fermentation using bacteria, and in both cases lysozyme has been recruited to degrade these bacteria. Therefore, langur and cow lysozymes are homologous as genes; however, as digestive enzymes they are not homologous because this functionality was not present in the ancestral lysozyme."

    Although sequence determines structure, it is possible for two proteins to have very different sequences and functions and share a common fold. In fact, most gene products with similar three-dimensional structures are insufficiently similar at the sequence level for true homology or analogy (non-homologous similarity) to be distinguished.
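
    What can be expressed as a percentage is identity between aligned sequences, not homology. Here is a minimal sketch of that calculation. It assumes the two sequences are already aligned with "-" as the gap character and that the denominator is the full alignment length; other conventions for the denominator exist, so treat this choice as an assumption. The function name percent_identity is made up for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of alignment columns holding the same residue in both rows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("alignment is empty")
    # Count columns where both rows carry the same residue (gaps never count).
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * identical / len(seq_a)


# Five columns, three identical residues (A, Q, H).
identity = percent_identity("AIWQH", "AL-QH")
print(identity)
```

    A high identity may suggest homology, but the number itself is only a measurement on one particular alignment.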

    What is an ontology?

    Biology is changing from being a descriptive to an analytical science. Accurate and consistent descriptions are, however, vital to analysis. The idea of ontologies has been co-opted from philosophy and artificial intelligence to partition bioinformatic knowledge in a way which can be reliably navigated by computers.

    This preprint of a review by Ele Holloway of the European Bioinformatics Institute gives a more detailed insight into the varied approaches to ontologies in bioinformatics by covering a recent meeting on the subject. The final version appears in Comparative and Functional Genomics.

    What is a scoring matrix?

    The following explanation was edited from a contribution by Amelie Stein.

    The aim of a sequence alignment is to match "the most similar elements" of two sequences. This similarity must be evaluated somehow. For example, consider the following two alignments:

    (a)

    AIWQH  
    AL-QH  

    (b)

    AIWQH  
    A-LQH  

    They seem quite similar: both contain one "indel" and one substitution, just at different positions. However, if we think of the letters as amino acid residues rather than elements of strings, alignment (a) is the better one, because isoleucine (I) and leucine (L) are similar sidechains, while tryptophan (W) has a very different structure. This is a physico-chemical measure; we might prefer these days to say that leucine simply substitutes for isoleucine more frequently---without giving an underlying "reason" for this observation.

    However we explain it, it is much more likely that a mutation changed I into L and that W was lost, as in (a), than that W changed into L and I was lost. We would expect that a change from I to L would not affect the function as much as a mutation from W to L---but this deserves its own topic.

    To quantify the similarity achieved by an alignment, scoring matrices are used: they contain a value for each possible substitution, and the alignment score is the sum of the matrix's entries for each aligned amino acid pair. For gaps (indels), a special gap score is necessary---a very simple one is just to add a constant penalty score for each indel. The optimal alignment is the one which maximizes the alignment score.
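
    To make the scoring rule concrete, here is a minimal sketch that sums substitution scores and a constant gap penalty over alignments (a) and (b). All the numbers are hypothetical toy values chosen for illustration; they are NOT taken from any published PAM table. The names TOY_SCORES, MATCH_SCORE, MISMATCH_SCORE and GAP_PENALTY are likewise invented here.

```python
# Hypothetical toy values, not from any published scoring matrix.
TOY_SCORES = {
    ("I", "L"): 2,    # similar side chains: mildly rewarded
    ("L", "W"): -2,   # dissimilar side chains: penalised
}
MATCH_SCORE = 5       # score for identical residues (toy value)
MISMATCH_SCORE = -1   # fallback for pairs not listed above (toy value)
GAP_PENALTY = -4      # constant penalty per indel column (toy value)


def pair_score(a, b):
    """Look up the score for one aligned residue pair, in either order."""
    if a == b:
        return MATCH_SCORE
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), MISMATCH_SCORE))


def alignment_score(seq_a, seq_b):
    """Sum pair scores and gap penalties over every alignment column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            total += GAP_PENALTY
        else:
            total += pair_score(a, b)
    return total


score_a = alignment_score("AIWQH", "AL-QH")  # alignment (a)
score_b = alignment_score("AIWQH", "A-LQH")  # alignment (b)
print(score_a, score_b)
```

    With these toy values alignment (a) scores higher than (b), which matches the argument above that pairing I with L is preferable to pairing W with L. A different matrix or gap penalty could change the ranking.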

    PAM matrices are a common family of scoring matrices. PAM stands for Point Accepted Mutation (it is also often expanded as Percent Accepted Mutations), where "accepted" means that the mutation has been adopted by the sequence in question. Thus, using the PAM 250 scoring matrix assumes that about 250 mutations per 100 amino acids may have happened, while with PAM 10 only 10 mutations per 100 amino acids are assumed, so that only very similar sequences will reach useful alignment scores.

    PAM matrices contain positive and negative values. If the alignment score is greater than zero, the sequences are considered to be related (they are similar with respect to the scoring matrix used); if the score is negative, it is assumed that they are not related. "Relationship" here may refer to evolution as well as functionality of the proteins. Of course, the choice of the matrix affects the result, so one has to make an assumption about the similarity of the sequences in order to obtain a useful result: rather distant sequences won't produce a good alignment using PAM 10, and the optimal alignment of two very similar sequences with PAM 500 may be less useful than that with PAM 50.

    Finally, it should be noted that some scoring matrices use similarity to evaluate alignments, but others use distance, so be careful when interpreting the results!

    After this brief and necessarily superficial overview, you might want to read some more about scoring matrices.

    Acknowledgements

    Questions

    Thanks to the following people for questions:

    Links

    Thanks to the following people for corrections, links and sources:

    Answers

    Thanks to the following people for suggesting answers:

    Small Print:

    Author and licensing

    This resource is maintained by and © Damian Counsell, UK Medical Research Council Rosalind Franklin Centre for Genomic Research (the RFCGR) 1998-2004. It is made available under a modified version of the Open Publication Licence.

    The FAQ has also been mirrored, without credit or any attempt to link to the Open Content Licence, at the so-called "National Bioinformatics Institute". If you are thinking of handing over money for their "certification" you can draw your own conclusions about their standing from this fact.

    The first version of this Bioinformatics FAQ was prepared when I was responsible for bioinformatics in the Section for Cell and Molecular Biology at the Institute of Cancer Research (the ICR) in London.

    I am now a Bioinformatics Specialist at the Rosalind Franklin Centre for Genomics Research, part of the Proteomics Group and am supported by the Medical Research Council. This page does not represent their views, but I will happily read your criticisms. Although I may act on your advice I take no responsibility for anything that might happen if you browse here.

     

     
