
The Bioinformatics FAQ, written by Damian Counsell.

Overview

Contents

  • History of bioinformatics: How old is the discipline?
  • Books: Can you recommend any bioinformatics books?
  • General introductions
  • Computational/Mathematical aspects of bioinformatics
  • Applying bioinformatics in biological research
  • Fiction
  • Other lists of bioinformatics books
  • Centres of bioinformatics activity: Where is bioinformatics done?
  • Research centres
  • Sequencing centres
  • Standards centres
  • Are there any standards in bioinformatics?
  • "Virtual" centres (for example consortia and communities)
  • Resources available on the Internet: What bioinformatics Websites are there?
  • Blogs
  • Information
  • Directories
  • Portals
  • Community
  • Tool collections
  • Tutorials
  • Education: Where can I study bioinformatics…
  • … in Africa?
  • … in the Americas?
  • … in Asia?
  • … in Australasia?
  • … in Europe
  • … remotely (distance/correspondence courses)
  • Careers: How can I become a bioinformaticist?
  • Practical tasks: How can I tackle specific, common bioinformatics tasks?
  • How can I find a sequence?
  • … I have a description.
  • … I have an accession number.
  • … I have another sequence.
  • … I am not sure whether to use the defaults.
  • How can I align two sequences?
  • How can I predict the function of a gene (product)?
  • How can I predict the structure of a sequence?
  • How can I write up?
  • Glossary of bioinformatics terms
  • What is an alignment?
  • What is a DNA array?
  • What is a homologue?
  • What is an ontology?
  • What is a scoring matrix?
  • Acknowledgements
  • Small print
  • Definition: What is bioinformatics?

    Definition of Bioinformatics: What is Bioinformatics?

    Roughly, bioinformatics describes any use of computers to handle biological information.

    In practice, the definition used by most people is narrower; bioinformatics to them is a synonym for "computational molecular biology"---the use of computers to characterize the molecular components of living things.

    What is Bioinformatics?---The Tight definition

    "Classical" bioinformatics

    Most biologists talk about "doing bioinformatics" when they use computers to store, compare, retrieve, analyse or predict the composition or the structure of biomolecules. As computers become more powerful you could probably add "simulate" to this list of bioinformatics verbs. "Biomolecules" include your genetic material---nucleic acids---and the products of your genes: proteins. These are the concerns of "classical" bioinformatics, which deals primarily with sequence analysis.

    Khairuddin Itam drew my attention to this crisp definition of bioinformatics, dating back to 1987, from P. Hogeweg: "[Bioinformatics is] the study of informatic processes in biotic systems"

    Fredj Tekaia at the Institut Pasteur offers this definition of bioinformatics:

    "The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information."

    It is a mathematically interesting property of most large biological molecules that they are polymers: ordered chains of simpler molecular modules called monomers. Think of the monomers as beads or building blocks which, despite having different colours and shapes, all have the same thickness and the same way of connecting to one another.

    Monomers that can combine in a chain are of the same general class, but each kind of monomer in that class has its own well-defined set of characteristics.

    Many monomer molecules can be joined together to form a single, far larger, macromolecule. Macromolecules can have exquisitely specific informational content and/or chemical properties.

    According to this scheme, the monomers in a given macromolecule of DNA or protein can be treated computationally as letters of an alphabet, put together in pre-programmed arrangements to carry messages or do work in a cell.

    "New" bioinformatics

    The greatest achievement of bioinformatics methods, the Human Genome Project, is currently being completed. Because of this, the nature and priorities of bioinformatics research and applications are changing. People often talk portentously of our living in the "post-genomic" era. My personal view is that this will affect bioinformatics in several ways:

    • Now that we possess multiple whole genomes we can look for differences and similarities between all the genes of multiple species. From such studies we can draw particular conclusions about species and general ones about evolution. This kind of science is often referred to as comparative genomics.
    • There are now technologies designed to measure the relative number of copies of a genetic message (levels of gene expression) at different stages in development or disease, or in different tissues. Such technologies, like DNA microarrays, will grow in importance.
    • Other, more direct, large-scale ways of identifying gene functions and associations (for example yeast two-hybrid methods) will grow in significance, and with them the accompanying bioinformatics of functional genomics.
    • There will be a general shift in emphasis (of sequence analysis especially) from genes themselves to gene products. This will lead to:
      • attempts to catalogue the activities of, and characterize the interactions between, all gene products (in humans): proteomics.
      • attempts to crystallize and/or predict the structures of all proteins (in humans): structural genomics.
      • fewer DNA double-helices in bad sci-fi movies.
    • What some people refer to as research or medical informatics, the management of all biomedical experimental data associated with particular molecules or patients---from mass spectroscopy, to in vitro assays, to clinical side-effects---will move from being the concern of those working in drug company and hospital I.T. (information technology) into the mainstream of cell and molecular biology, and will migrate from the commercial and clinical sectors to the academic ones.

    This FAQ concentrates on classical bioinformatics, but will, I hope, grow to cover more of the "post-genomic" aspects of the field. It is worth noting that all of the above non-classical areas of research depend upon established sequence analysis techniques.

    Definitions of Fields Related to Bioinformatics

    What is Biophysics?

    Molecular biology itself grew out of biophysics. The British Biophysical Society defines biophysics as:

    "an interdisciplinary field which applies techniques from the physical sciences to understanding biological structure and function"
    Much information about the different facets of the discipline can be found at the Society's Website, hosted at Birkbeck College, London.

    Mick Goodrich wrote to ask what the status of biophysics was, given the definition of computational biology offered by Paul Schulte (below). A recent article in The Scientist [free registration required] deals with this question---thanks to Jo Wixon (Managing Editor of Comparative and Functional Genomics) for the reference.

    What is Computational Biology?

    Computational biologists might object (please do), but I find that people use "computational biology" when discussing that subset of bioinformatics (in the broadest sense) closest to the field of general classical biology.

    Computational biologists interest themselves more in evolutionary, population and theoretical biology than in cell and molecular biomedicine. It is inevitable that molecular biology is profoundly important in computational biology, but it is certainly not what computational biology is all about (see the next paragraph). In these areas of computational biology it seems that computational biologists have tended to prefer statistical models of biological phenomena over physico-chemical ones. This is often wise…

    One computational biologist (Paul J Schulte) did object to the above and makes the entirely valid point that this definition derives from a popular use of the term, rather than a correct one. Paul works on water flow in plant cells. He points out that biological fluid dynamics is a field of computational biology in itself. He argues that this, and any application of computing to biology, can be described as "computational biology" (see also the "loose" definition of bioinformatics below). Where we disagree, perhaps, is in the conclusion he draws from this---which I reproduce in full:

    "Computational biology is not a "field", but an "approach" involving the use of computers to study biological processes and hence it is an area as diverse as biology itself."

    Richard Durbin, Head of Informatics at the Wellcome Trust Sanger Institute, expressed an interesting opinion on this distinction in an interview:

    "I do not think all biological computing is bioinformatics, e.g. mathematical modelling is not bioinformatics, even when connected with biology-related problems. In my opinion, bioinformatics has to do with management and the subsequent use of biological information, particular genetic information."

    What is Medical Informatics?

    The Medical Informatics FAQ (no relation) provides the following definition:

    "Biomedical Informatics is an emerging discipline that has been defined as the study, invention, and implementation of structures and algorithms to improve communication, understanding and management of medical information."

    That FAQ is also available here.

    Aamir Zakaria, the author of the FAQ, emphasizes that medical informatics is more concerned with structures and algorithms for the manipulation of medical data than with the data itself.

    This suggests that one difference between bioinformatics and medical informatics as disciplines lies in their approaches to the data; there are bioinformaticists interested in the theory behind the manipulation of that data, and there are bioinformatics scientists concerned with the data itself and its biological implications. (I believe that a good bioinformatics researcher should be interested in both of these aspects of the field.)

    Medical informatics, for practical reasons, is more likely to deal with data obtained at "grosser" biological levels---that is, information from super-cellular systems, right up to the population level---while most bioinformatics is concerned with information about cellular and biomolecular structures and systems.

    On both of these points I would be happy for any medical informatics specialist to correct me.

    What is Cheminformatics?

    The Web advertisement for Cambridge Healthtech Institute's Sixth Annual Cheminformatics conference describes the field thus:

    "the combination of chemical synthesis, biological screening, and data-mining approaches used to guide drug discovery and development"

    but this, again, sounds more like a field being identified by some of its most popular (and lucrative) activities, rather than by including all the diverse studies that come under its general heading.

    The story of one of the most successful drugs of all time, penicillin, seems bizarre, but the way we discover and develop drugs even now has similarities, being the result of chance, observation and a lot of slow, intensive chemistry. Until recently, drug design always seemed doomed to continue to be a labour-intensive, trial-and-error process. The possibility of using information technology to plan intelligently and to automate processes related to the chemical synthesis of possible therapeutic compounds is very exciting for chemists and biochemists. The rewards for bringing a drug to market more rapidly are huge, so naturally this is what a lot of cheminformatics work is about.

    Here is a page with a commercial slant which links to some interesting discussions of the term "cheminformatics", what it means, whether or not it exists as a distinct discipline, and even whether it should be replaced by "chemoinformatics".

    The span of academic cheminformatics is wide and is exemplified by the interests of the cheminformatics groups at the Centre for Molecular and Biomolecular Informatics at the University of Nijmegen in the Netherlands. These interests include:

    • Synthesis planning
    • Reaction and structure retrieval
    • 3-D structure retrieval
    • Modelling
    • Computational chemistry
    • Visualisation tools and utilities

    Trinity University's Cheminformatics Web page, for another example, concerns itself with cheminformatics as the use of the Internet in chemistry.

    What is Genomics?

    Genomics is a field which existed before the completion of the sequences of genomes, but in the crudest of forms; for example, the oft-referenced estimate of 100 000 genes in the human genome derived from a(n) (in)famous piece of "back of an envelope" genomics, guessing the weight of chromosomes and the density of the genes they bear. Genomics is any attempt to analyze or compare the entire genetic complement of a species or species (plural). It is, of course, possible to compare genomes by comparing more-or-less representative subsets of genes within genomes.

    What is Mathematical Biology?

    Mathematical biology is easier to distinguish from bioinformatics than computational biology is. Mathematical biology also tackles biological problems, but the methods it uses to tackle them need not be numerical and need not be implemented in software or hardware. Indeed, such methods need not "solve" anything; in mathematical biology it would be considered reasonable to publish a result which merely establishes that a biological problem belongs to a particular general class.

    The distinction between bioinformatics and mathematical biology was illuminated by an e-mail I received from Alex Kasman at the College of Charleston. According to his working definition, he distinguished bioinformatics which (under the tight definition at least)…

    "… seems to focus almost exclusively on specific algorithms that can be applied to large molecular biological data sets…"

    … from mathematical biology which…

    "… includes things of theoretical interest which are not necessarily algorithmic, not necessarily molecular in nature, and are not necessarily useful in analyzing collected data."

    What is Proteomics?

    A recent review of proteomics in the journal Nature defined the field this way:

    "The term proteome was first coined to describe the set of proteins encoded by the genome. The study of the proteome, called proteomics, now evokes not only all the proteins in any given cell, but also the set of all protein isoforms and modifications, the interactions between them, the structural description of proteins and their higher-order complexes, and for that matter almost everything "post-genomic"."

    Michael J. Dunn, the Editor-in-Chief of Proteomics, defines the "proteome" as:

    "the PROTEin complement of the genOME"

    and proteomics to be concerned with:

    "Qualitative and quantitative studies of gene expression at the level of the functional proteins themselves"

    that is:

    "an interface between protein biochemistry and molecular biology"

    Characterizing the many tens of thousands of proteins expressed in a given cell type at a given time---whether measuring their molecular weights or isoelectric points, identifying their ligands or determining their structures---involves the storage and comparison of vast numbers of data. Inevitably this requires bioinformatics. Here is a constructively sceptical review by Lukas Huber.

    What is Pharmacogenomics?

    Pharmacogenomics is the application of genomic approaches and technologies to the identification of drug targets. Examples include trawling entire genomes for potential receptors by bioinformatics means, investigating patterns of gene expression in both pathogens and hosts during infection, or examining the characteristic expression patterns found in tumours or patient samples for diagnostic purposes (or possibly in the pursuit of potential cancer therapy targets).

    The term "pharmacogenomics" is also used for the more "trivial"---but arguably more useful---application of bioinformatics approaches to the cataloguing and processing of information relating to pharmacology and genetics, for example the accumulation of information in databases like this one. (Thanks to Ivanovi.)

    What is Pharmacogenetics?

    All individuals respond differently to drug treatments; some positively, others with little obvious change in their conditions and yet others with side effects or allergic reactions. Much of this variation is known to have a genetic basis. Pharmacogenetics is a subset of pharmacogenomics which uses genomic/bioinformatic methods to identify genomic correlates, for example SNPs (Single Nucleotide Polymorphisms), characteristic of particular patient response profiles, and uses those markers to inform the administration and development of therapies. Strikingly, such approaches have been used to "resurrect" drugs previously thought to be ineffective, but subsequently found to work within a subset of patients. They can also be used for optimizing the doses of chemotherapy for particular patients.
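    As a rough illustration of what spotting SNPs involves computationally, the sketch below compares one sample sequence with a reference of the same length and reports every single-letter difference. The function name `find_snps` and both sequences are invented for this example; real SNP calling works from many aligned reads and has to cope with insertions, deletions and sequencing errors.

```python
def find_snps(reference: str, sample: str):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    this is a toy comparison, not a variant caller.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    snps = []
    for index, (ref_base, sample_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != sample_base:
            snps.append((index + 1, ref_base, sample_base))
    return snps


if __name__ == "__main__":
    # Hypothetical reference and patient sequences.
    print(find_snps("ACGTACGT", "ACGTTCGT"))
```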

    Overview of most common bioinformatics programs

    Everyday bioinformatics is done with sequence search programs like BLAST, sequence analysis programs like the EMBOSS and Staden packages, structure prediction programs like THREADER or PHD, or molecular imaging/modelling programs like RasMol and WHATIF.

    Overview of most common bioinformatics technology

    Currently, a lot of bioinformatics work is concerned with the technology of databases. (Thanks again to Ivanovi.) These databases include both "public" repositories of gene data like GenBank or the Protein DataBank (the PDB), and private databases, like those used by research groups involved in gene mapping projects or those held by biotech companies. Making such databases accessible via open standards is very important. Consumers of bioinformatics data use a range of computer platforms: from the more powerful and forbidding UNIX boxes favoured by the developers and curators to the far friendlier Macs often found populating the labs of computer-wary biologists.

    Databases of existing sequencing data can be used to identify homologues of new molecules that have been amplified and sequenced in the lab. The property of sharing a common ancestor, homology, can be a very powerful indicator in bioinformatics (see below).

    Acquisition of sequence data

    Bioinformatics tools can be used to obtain sequences of genes or proteins of interest, either from material obtained, labelled, prepared and examined in electric fields by individual researchers/groups or from repositories of sequences from previously investigated material.

    Analysis of data

    Both types of sequence can then be analysed in many ways with bioinformatics tools.

    They can be assembled. Note that this is one of the occasions when the meaning of a biological term differs markedly from a computational one (see the amusing confusion over the issue at Web-based geek forum Slashdot). Computer scientists, banish from your mind any thought of assembly language. Sequencing can only be performed for relatively short stretches of a biomolecule and finished sequences are therefore prepared by arranging overlapping "reads" of monomers (single beads on a molecular chain) into a single continuous passage of "code". This is the bioinformatic sense of assembly.
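    To make the bioinformatic sense of assembly concrete, here is a minimal sketch that repeatedly merges the two reads with the longest suffix-to-prefix overlap. It is a simple greedy strategy chosen for illustration, not the method of any particular assembler; the function names, the `min_len` threshold and the reads are all assumptions, and real data would also need handling for sequencing errors, repeats and the reverse strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Merge reads pairwise, best overlap first, until no overlaps remain."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Three invented overlapping reads.
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```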

    They can be mapped---that is, their sequences can be parsed to find sites where so-called "restriction enzymes" will cut them.
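    Mapping in this sense comes down to searching a sequence for an enzyme's recognition site. The sketch below uses EcoRI, which recognizes GAATTC and cuts between the G and the first A; the helper names `find_sites` and `digest` and the test sequence are made up for illustration. It scans one strand only, which is enough for a palindromic site like this one.

```python
def find_sites(seq: str, site: str = "GAATTC"):
    """Return the 0-based start positions of every occurrence of `site`."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1):
    """Cut `seq` at each site; `cut_offset` is how far into the site the cut falls."""
    fragments = []
    previous = 0
    for start in find_sites(seq, site):
        cut = start + cut_offset
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


if __name__ == "__main__":
    dna = "AAGAATTCGGGAATTCTT"  # invented sequence with two EcoRI sites
    print(find_sites(dna))
    print(digest(dna))
```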

    They can be compared, usually by aligning corresponding segments and looking for matching and mismatching letters in their sequences. Genes or proteins that are sufficiently similar are likely to be related and are therefore said to be "homologous" to each other---the whole truth is rather more complicated than this. Such cousins are called "homologues".
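    Once two sequences have been aligned, "looking for matching and mismatching letters" can be as simple as the count below. This sketch assumes the alignment has already been made by some other tool and that gaps are written as "-"; the function name and example strings are hypothetical, and percent identity on its own is not proof of homology.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical letters over the columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    print(percent_identity("GATTACA", "GACTATA"))
    print(percent_identity("GAT-ACA", "GATTACA"))
```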

    If a homologue (a related molecule) exists, then a newly discovered protein may be modelled---that is, the three-dimensional structure of the gene product can be predicted without doing laboratory experiments.

    Bioinformatics is used in primer design. Primers are short sequences needed to make many copies of (amplify) a piece of DNA as used in PCR (the Polymerase Chain Reaction).
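    Two small calculations sit underneath most primer design: taking the reverse complement of a stretch of DNA, and estimating a primer's melting temperature. The sketch below uses the Wallace rule of thumb for short oligonucleotides (2 °C per A or T, 4 °C per G or C), which is only a rough estimate. The function names, the naive "take both ends of the template" strategy and the template itself are assumptions made for illustration; dedicated primer-design programs weigh many more factors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (short primers only)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def naive_primer_pair(template: str, length: int = 18):
    """Take the first `length` bases as the forward primer and the reverse
    complement of the last `length` bases as the reverse primer."""
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    forward, reverse = naive_primer_pair("ATGCCCGGGTAA", length=4)  # toy template
    print(forward, wallace_tm(forward))
    print(reverse, wallace_tm(reverse))
```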

    Bioinformatics is used to attempt to predict the function of actual gene products.

    Information about the similarity, and, by implication, the relatedness of proteins is used to trace the "family trees" of different molecules through evolutionary time.
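    Tracing such family trees usually starts from a table of pairwise differences. As a minimal sketch, the code below computes the proportion of differing positions between pre-aligned sequences and picks the closest pair, which is the kind of first step a distance-based tree-building method would take. The species labels and sequences are invented, and the function names are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of positions at which two equal-length, non-empty sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(sequences: dict):
    """Return the pair of names whose sequences have the smallest p-distance."""
    return min(
        combinations(sorted(sequences), 2),
        key=lambda pair: p_distance(sequences[pair[0]], sequences[pair[1]]),
    )


if __name__ == "__main__":
    toy = {
        "human": "ACGTACGT",
        "chimp": "ACGTACGA",
        "mouse": "ACCTTCGA",
    }
    print(closest_pair(toy))
```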

    There are various other applications of computer analysis to sequence data, but, with so much raw data being generated by the Human Genome Project and other initiatives in biology, computers are presently essential for many biologists just to manage their day-to-day results.

    Molecular modelling / structural biology is a growing field which can be considered part of bioinformatics. There are, for example, tools which allow you (often via the Net) to make pretty good predictions of the secondary structure of proteins arising from a given amino acid sequence, often based on known "solved" structures and other sequenced molecules acquired by structural biologists.

    Structural biologists use "bioinformatics" to handle the vast and complex data from X-ray crystallography, nuclear magnetic resonance (NMR) and electron microscopy investigations and create the 3-D models of molecules that seem to be everywhere in the media.

    Note

    Unfortunately the word "map" is used in several different ways in biology/genetics/bioinformatics. The definition given above is the one most frequently used in this context, but a gene can be said to be "mapped" when its parent chromosome has been identified, when its physical or genetic distance from other genes is established and---less frequently---when the structure and locations of its various coding components (its "exons") are established.

    What is Bioinformatics?---The Loose definition

    There are other fields---for example medical imaging / image analysis---which might be considered part of bioinformatics. There is also a whole other discipline of biologically-inspired computation: genetic algorithms, AI, neural networks. Often these areas interact in strange ways. Neural networks, inspired by crude models of the functioning of nerve cells in the brain, are used in a program called PHD to predict, surprisingly accurately, the secondary structures of proteins from their primary sequences.

    What almost all bioinformatics has in common is the processing of large amounts of biologically-derived information, whether DNA sequences or breast X-rays.

    How old is the discipline?

    "How old is bioinformatics?" The answer to this one depends on which source you choose to read.

    From T K Attwood and D J Parry-Smith's "Introduction to Bioinformatics", Prentice-Hall 1999 [Longman Higher Education; ISBN 0582327881]:

    "The term bioinformatics is used to encompass almost all computer applications in biological sciences, but was originally coined in the mid-1980s for the analysis of biological sequence data."

    From Mark S. Boguski's article in the "Trends Guide to Bioinformatics" Elsevier, Trends Supplement 1998 p1:

    "The term "bioinformatics" is a relatively recent invention, not appearing in the literature until 1991 and then only in the context of the emergence of electronic publishing...

    "...However, some of my role models when I was a graduate student (Margaret O. Dayhoff, Russell F. Doolittle, Walter M. Fitch and Andrew D. McLachlan) had been building databases, developing algorithms and making biological discoveries by sequence analysis since the 1960s---long before anyone thought to label this activity with a special term (if anything it was called `molecular evolution'). Even a relatively new kid on the block, the National Center for Biotechnology Information (NCBI), is celebrating its 10th anniversary this year, having been written into existence by US Congressman Claude Pepper and President Ronald Reagan in 1988. So bioinformatics has, in fact, been in existence for more than 30 years and is now middle-aged."

    Books: Can you recommend any bioinformatics books?

    It's notoriously difficult to find any books on bioinformatics itself that cater well for all of those coming from computing, from mathematics and from biology backgrounds. The few textbooks available in the field tend to be eyewateringly expensive as well. I've divided suggested reading into books of general interest, those best suited to people coming from a computational/mathematical background and books for biologists interested in bioinformatics. Where a book is also listed in Bioinformatics.Org's books section I have linked the title to the relevant entry there. Links to other lists of bioinformatics books follow this section of suggested reading.

    General introductions

    Many people are curious about the Human Genome (Project). The completion of the first draft probably represents bioinformatics' coming of age as a discipline. The first couple of books are aimed at the intelligent layperson.

    A gossipy and insightful account of the race to sequence the genome can be found in "The Sequence" by Kevin Davies [Weidenfeld; ISBN 0297646982]. Matt Ridley's "Genome" [Fourth Estate; ISBN 185702835X] is both an interesting layperson's introduction to the issues raised by the bioinformatic revolution and an overview of its biology and enormous scope. If I remember rightly, Ridley's book received a slightly snooty review from Walter Bodmer. This is understandable, since his and Robin McKie's excellent "pre-genomic" guide to the Human Genome Mapping Project, "The Book of Life" [Oxford Paperbacks; ISBN 0195114876] was undeservedly in a remainders bin when I bought my copy a couple of years ago.

    If you are a non-biological scientist (or a non-scientist) and are hooked by these, why not go back to the "real beginning" of the race and read James Watson's entertaining and indiscreet memoir of his and Francis Crick's determination of the structure of DNA, "The Double Helix" [Penguin; ISBN 0140268774]---now updated with an introduction by media don Steve Jones.

    Nigel Barber at Peterborough Regional College in the UK recommends Gary Zweiger's "Transducing the Genome" [McGraw-Hill Professional Publishing: ISBN 0071369805]. The summary at Amazon makes it sound a tad pretentious, but all the reviews seem pretty positive so it might be worth a read.

    If you are a quantitative scientist and would like a deeper knowledge of contemporary (molecular) biology, but you want to acquire it as painlessly as possible you could try the following:

    • Donna Rae Siegfried's Biology for Dummies [Wiley; ISBN 0-7645-5326-7] is fun, well thought out and a lot more informative than the title might suggest. If only all biology textbooks were this entertaining and unpretentious.
    • If you already have some biological knowledge and would like to get a grip on modern biomolecular science then Richard J. Epstein's Human Molecular Biology is an elegant, colourful and detailed guide.

    There are two classic competing texts in cell and molecular biology which Maximilian Haeussler reminds me to include: Alberts et al's Molecular Biology of the Cell [Garland Science: ISBN 0815340729] and Molecular Biology of the Gene [Benjamin Cummings: ISBN 0321248643].

    Computational/Mathematical aspects

    If you are a hardcore maths/computing person Michael Waterman's "Introduction to Computational Biology" [Chapman & Hall/CRC Statistics and Mathematics; ISBN 0412993910] and Pavel Pevzner's "Computational Molecular Biology - An Algorithmic Approach" [The MIT Press (A Bradford Book); ISBN 0262161974] will give you all the discrete maths you can shake a stick at, but perfunctory introductions to the biology.

    Bioinformatics.Org's very own Jeff Bizzaro recommends Dan Gusfield's "Algorithms on Strings, Trees and Sequences" [Cambridge, 1997 ISBN 0-52158-519-8], Richard Durbin, S. Eddy, A. Krogh, G. Mitchison "Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids" [Cambridge, 1997 ISBN 0-52162-971-3] (which I think is one of the clearest and most comprehensive guides to alignment algorithms) and---for that full "computers-to-biology conversion"--- Geoffrey M. Cooper "The Cell: A Molecular Approach" [ASM Press, 1996 ISBN 0-87893-119-8]. Jeff Ames writes that a second edition of this book is now available [Sinauer Associates, Incorporated, 2000 ISBN 0-87893-106-6] and that this version---if you can find it in the shops---comes with a CD.

    Applying bioinformatics to biological research

    One outstanding general text for the biologist is David W. Mount's "Bioinformatics" [Cold Spring Harbor Press; ISBN 0879696087]. It's not cheap, but it's the best I've seen if you are studying bioinformatics itself.

    Bioinformatics has been dismissed by some as "the science of BLAST searches". The best collection of advice so far on doing BLAST searches is O'Reilly's BLAST book by Ian Korf, Mark Yandell and Joseph Bedell [O'Reilly ISBN 0-596-00299-8]. I reviewed it enthusiastically, but not uncritically, for the UK UNIX Users' Group magazine. I'd go as far as to say that all biologists thinking of using BLAST in their research should read the relevant sections before they even go near a computer.

    If you wish to use general bioinformatics tools, especially if you are a little wary of computers, my new "best" book is "Bioinformatics for Dummies" [John Wiley and Sons ISBN 0764516965]. It is (obviously) aimed at people who are beginners, who are happier using the Web than typing commands, and who are more interested in learning than in impressing people---the writing is friendly, clear and unpretentious. However, like several of my other tips (below), it concentrates on Web-based resources so it will, inevitably, date. (This is partially compensated for by there being a companion Website.)

    Also, if you're coming to the subject as a computer user with a biological background, looking to exploit the many tools available, you might want to try Terry Attwood and David Parry-Smith's "Introduction to Bioinformatics" [Longman Higher Education; ISBN 0582327881], or Des Higgins and Willie Taylor's "Bioinformatics: Sequence Structure and Databanks" [Oxford University Press; ISBN 0199637903]. Another excellent practical introduction is Andreas Baxevanis and Francis Ouellette's "Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins" [Wiley-Interscience; ISBN 0471383910], now in its new and improved second edition. Bax teaches bioinformatics all over Canada and the experience shows. Arthur Lesk has also produced an excellent teaching book, particularly for protein bioinformatics, in his "Introduction to Bioinformatics".

    Bioinformatics.Org also recommends Cynthia Gibas and Per Jambeck's "Developing Bioinformatics Skills" [O'Reilly, 2001 ISBN 1-56592-664-1].

    Stuart Brown recommends his own book "Bioinformatics: A Biologist's Guide to Biocomputing and the Internet" [Eaton Pub Co; ISBN: 188129918X]. If he sends me a review copy I might recommend it too ;-) .

    Fiction books

    "Darwin's Radio" by Greg Bear [Ballantine Books, ISBN: 0345435249] is a wonderful hard SF thriller which stretches ideas derived from genome discoveries to their breaking point. It's gripping and humane.

    Leonard Crane, the author of Ninth Day of Creation kindly sent me a copy for review. So far it's an excellent read. I haven't finished it yet, not because it isn't a rattling good story, but because, like "Darwin's Radio", it is very long and because I am very busy. If you'd like to read a well-researched, but speculative, novel containing actual scenes of practising bioinformatics then try it.

    Ken Allen contributed the following reviews:

    "Frameshift [Tor Books, ISBN: 0812571088] by Robert J. Sawyer---based around the HGP---reasonable read, but poor / confused ending."

    "Calculating God [Tor Books, ISBN: 0812580354] by the same author---has a subtler bio connection and is a much better read. Near the start an alien spacecraft lands, the alien emerges and says 'take me to your paleontologist'."

    Further suggestions for this section are welcome.

    Other lists of bioinformatics books

    See also compbiology.org's list, Steve Brenner's list, and Aik Choon Tan's collection of books.

    Centres of Bioinformatics Activity: Where is bioinformatics done?

    The biggest and best source of bioinformatics links I have encountered is the Genome Web at the Rosalind Franklin Centre for Genomics Research at the Genome Campus near Cambridge, UK. Most of the links below come from that resource. My list is necessarily limited by comparison.

    Research centres

    Sequencing centres

    [XXXX INSERT DETAILS OF MORE SEQUENCING CENTRES HERE]

    Standards centres

    [XXXX INSERT DETAILS OF STANDARDS CENTRES HERE]

    What virtual centres (for example consortia and communities) for bioinformatics activity are there?

    [XXXX INSERT MORE DETAILS OF VIRTUAL BIOINFORMATICS CENTRES HERE]

    Online Resources: What bioinformatics Websites are there?

     

     

    Tutorials

    A great place to start, whether you come from a biological, physical or computational background is at Martin Vingron's superb online bioinformatics tutorial. (Begin by choosing a section from the left-hand-side menu bar.)

    Tom Smith and Don Emmeluth have produced a nice little exploration of bioinformatics using NCBI resources and tools.

    I recently stumbled upon a promising set of online lecture notes currently under construction by B. Steipe at the Genzentrum (Gene Center) at the Ludwig-Maximilians-Universität München (University of Munich).

    Chemistry for all

    A defiantly frames-free chemistry tutorial site.

    Mathematics for biologists

    First of all, an almost completely painless introduction to the horrors of the quadratic equation by Peter Whalen, James Walker, and Drew Marticorena.

    C. J. Schwarz of the Department of Statistics and Actuarial Science, Simon Fraser University has produced a course in statistics which is accompanied by a set of sound, online PDF handouts.

    Here is a great guide to a whole array of statistical learning/teaching resources prepared by Juha Puranen of the University of Helsinki (English).

    Computers for biologists

    Programming for biologists

    General introduction to biology for computer scientists

    Estrella Mountain Community College in the States offers this excellent short introduction to biology (actually "The Nature of Science and Biology"). It's a great place for keyboard jockeys to start their journey to enlightenment. Thanks to Alex O'Neill for pointing out the broken link.

    Genetics

    The Dolan DNA Learning Center at Cold Spring Harbor has an outstanding interactive tutorial introducing genetics. To take full advantage of the multimedia elements you should download the Flash and Real players.

    Molecular biology for computer scientists

    The Institute of Arable Crop Research Beginner's Guide to Molecular Biology

    Protein chemistry for computer scientists

    Unilever Education Advanced Series tutorial on proteins.

    Cell biology for computer scientists

    The University of Arizona has made available a high-quality tutorial in cell biology. Not only does it cover the facts, but it also attempts to introduce some of the philosophy of the field---recommended. Even better, it's also available en Español and in Italiano.

    Once you've worked your way through that you might like to see some scanning electron microscope images of some of the structures you've read about taken by members of John Heuser's lab.

    Evolution for computer scientists

    Bob Patterson maintains his "Darwiniana" with amazing diligence.

    Practical bioinformatics

    Other lists of bioinformatics tutorials

    Education: Where can I study Bioinformatics...

    This section is not complete, but contributions to broaden its coverage are welcome. Please do not direct questions about eligibility, course quality or admissions policy to me; ask the individual institutions directly. Use the links to obtain contact details. If an institution doesn't provide telephone numbers/email addresses or snailmail details on its Web site it doesn't deserve your patronage.

    This resource focuses on complete, full-time degree programmes rather than on individual study modules. Curating a list of the latter would be a full-time job. You can go to other places, however, if you are looking for short courses. Thanks to various contributors, including Wentian Li who pointed me to this list at Rockefeller which is mirrored at various other sites. And to Humberto Ortiz Zuazaga for mailing me a link to the ICSB, where you can find this list.

    If you are interested in U.S. programmes, here's a list from Curtin and here's a list from Stanford. Thanks to Amelie Stein who also supplied some of the individual entries in this section.

    Those wanting to find programmes in the Asia Pacific region could have a look at this resource maintained by the Asia Pacific Bioinformatics Network APBioNet. Thanks to Sentausa.

    In the UK, The Bioinformatics Resource (part of the BBSRC's CCP11 project) maintains (among many other resources) lists of (mainly) British Masters and PhDs in bioinformatics. If you have any suggestions or updates please contact me with them. You can publicize your course and offer a public service at the same time.

    Africa

    Rhodes University, Grahamstown, South Africa offers an MSc. in Bioinformatics and Computational Molecular Biology. Thanks to Natalie Twine.

    Cathal Seoighe wrote a while back about the South African National Bioinformatics Institute (SANBI). Ruediger Braeuning has since written to point out that bioinformatics training in South Africa has been radically reorganized. He says:

    "A new institute, the National Bioinformatics Network (NBN), has been created. We have nodes at Universities all over the country (UWC, UCT, SUN, RU, UKZN, UP, WITS). Our main tasks are to:

    • develop capacity in Bioinformatics
    • perform world-class research
    • support local Biotechnology initiatives

    "We do offer courses on various topics in Bioinformatics ranging in length from 3 days to several weeks. We also train Bioinformaticists on MSc, PhD and post doc level. Undergraduate programs are currently being developed. Bursaries are available. For more information visit our Website."

    South African National Bioinformatics Institute (SANBI) Honours Bioinformatics Course at the University of the Western Cape. Next year the same institute will be offering a Master's in bioinformatics---thanks to Cathal Seoighe.

    If you know of any other bioinformatics courses on the African continent please feel free to mail me about them.

    The Americas

    Brazil

    According to Pablo Nehab-Hess the Laboratório Nacional de Computação Científica (LNCC), Brazil and the Universidade Federal do Rio de Janeiro (UFRJ) recently created a joint Bioinformatics MSc programme, through the Genetics Department of UFRJ and the Department of Applied Computational Mathematics of LNCC.

    Canada

    Thanks to Jordan Patterson for the information that the University of Alberta offers four-year Biology or Computer Science degrees with a specialization in bioinformatics. The Faculty of Computer Science there offers Master's and PhD training in bioinformatics.

    Benjamin Horsman wrote to tell me that Simon Fraser University and the University of British Columbia are collaborating on a new Bioinformatics training program with the British Columbia Cancer Agency. The program offers post-graduate diploma, Master's, and PhD training in Bioinformatics. Now Simon Fraser University also offers a joint major programme in Molecular Biology and Biochemistry (MBB) and Computer Science in Bioinformatics. Thanks to Brittany Nielsen for the info.

    Thanks to Olga Likhodi for the information that Seneca College, Toronto offers a post-graduate diploma in Bioinformatics.

    Peter Kublik informs me that from 2003/2004 the University of Calgary will offer a bioinformatics programme. He's part of the first intake.

    The University of Waterloo, Department of Computer Science offers undergraduate and graduate courses in bioinformatics. More information is here.

    California

    The Keck Graduate Institute claims that computational biology is a core element of the curriculum in its Master of Bioscience degree.

    Stanford University offers academic and professional (distance-learning) MSs in Biomedical Bioinformatics as well as its PhD programme. Thanks to Betty Cheng.

    Thanks to Momchil Georgiev for the information that the University of California at San Diego offers a Bioinformatics graduate programme and to Dana Brehm that there is now a new bachelor's program, to quote her:

    "[This is an] undergraduate, interdisciplinary program for undergraduates leading to a B.S. degree. The new Bioinformatics major is offered by the Division of Biology, and the departments of Chemistry/Biochemistry, Computer Science and Engineering, and Bioengineering. A student may choose to major in Bioinformatics in any one of the four departments or division. The Division of Biology currently offers two Bioinformatics courses, and with the advent of the cross-disciplinary major, even more courses are going to be taught 2002-03 and 2003-04."

    University of California, Irvine Informatics in Biology and Medicine

    David Delong wrote to me to point out that the College of Natural and Agricultural Sciences at the University of California, Riverside is developing a "Center in Genomics and Bioinformatics" which will offer a PhD curriculum in genomics and bioinformatics from academic year 2001-2002 onwards.

    Catherine Velazquez says that The University of California, Santa Cruz offers a new undergraduate BS course in bioinformatics. They have a Frequently Asked Questions. Now they also offer an MS/PhD in Bioinformatics. Thanks to Kevin Karplus for the update.

    Connecticut

    Javier Rojas Balderrama emailed me to point out that Yale University offers a Bioinformatics and Computational Biology track as part of its combined Biological and Biomedical Sciences graduate programme.

    Georgia

    Georgia Institute of Technology Masters of Science in Bioinformatics

    According to Eric VanWieren, Georgia State University offers a Master's and PhD in Computer Science with a focus on bioinformatics. The university's Bachelor of Science in Computer Science also offers a "Fundamentals of Bioinformatics" course.

    Illinois

    The University of Illinois at Chicago offers graduate programmes covering Bioengineering Bioinformatics through its Bioengineering department as well as an undergraduate course track. Thanks to Amit Sabnis.

    Indiana

    IUPUI offers an MS programme in Bioinformatics.

    Indiana University also offers an MS programme in Bioinformatics.

    Iowa

    Iowa State University offers an Interdisciplinary Ph.D. Program in Bioinformatics and Computational Biology (BCB).

    Maine

    The Jackson Lab, a world centre of mouse genome informatics, offers a graduate training program.

    Maryland

    Tim Young wrote to say that Johns Hopkins University in Maryland offers an MS in Bioinformatics through the Zanvyl Krieger School of Arts and Sciences Advanced Academic Programs and Whiting School of Engineering Engineering and Applied Science Programs for Professionals. They are also offering a Bioinformatics concentration with their MS in Biotechnology program.

    Massachusetts

    Boston University offers a graduate programme and so does its partner Northeastern University. Northeastern also offers a Graduate Certificate in the subject.

    Brandeis University offers both a Master of Science in Bioinformatics and a Graduate Certificate in Bioinformatics. Thanks to Matt Foster.

    The Department of Computer Science at UMass Lowell offers various degrees from Bachelor's through to PhD. level in Computer Science with Bioinformatics options.

    Mexico

    At the National Autonomous University of Mexico a doctoral program in biomedical sciences is available. Their Computational Molecular Biology Group is here.

    Minnesota

    The University of Minnesota offers a graduate programme in bioinformatics. Thanks to Lynda Ellis for the up-to-date link.

    Thanks to Anu Haniharan for drawing my attention to mixing up the Minnesota and New Jersey paragraphs.

    Nebraska

    The University of Nebraska Lincoln offers an Interdisciplinary Bioinformatics Specialization.

    The Graduate Program of the Pathology-Microbiology Department at the University of Nebraska Medical Center (University of Nebraska at Omaha) offers a specialty track in bioinformatics.

    New Jersey

    Rama Penta wrote to say that Stevens Institute of Technology offers a Master's programme in Bioinformatics.

    The message also states that the University of Medicine and Dentistry New Jersey (UMDNJ) offers a programme in biomedical informatics.

    Moustafa wrote to say that Ramapo College in New Jersey is the only school in New Jersey offering a Bachelor's degree in bioinformatics.

    New York State

    The University at Buffalo has been involved in establishing a "Center of Excellence in Bioinformatics". It used to offer a range of courses in bioinformatics and related subjects, but all the course links seem to be dead now. Thanks to Jeff Ligas for the original notification.

    Canisius College---also in Buffalo, NY---has had a state-approved B.S. in Bioinformatics since 2001. Thanks to Deb Burhans.

    Cornell and Rockefeller Universities, together with the Sloan-Kettering Research Institute, offer a "Tri-institutional program in Computational Biology and Medicine". Thanks to Brant Inman.

    Since September 2003 Farmingdale State University of New York has offered a unique baccalaureate Bioscience curriculum including bioinformatics as one of its concentrations. Thanks to Charles Adair for this information.

    Polytechnic University in Brooklyn offers a graduate programme in bioinformatics. Thanks to Bulat K.

    Rensselaer Polytechnic Institute offers both undergraduate and graduate programmes in bioinformatics.

    Rochester Institute of Technology offers BS, MS and BS/MS programmes in Bioinformatics. Thanks to Brandon H.

    According to Maureen Downey, the College of Staten Island, part of the City University of New York, also offers a challenging program in bioinformatics.

    If you know of any other bioinformatics courses on the American continent please feel free to mail me about them.

    North Carolina

    Duke University's Center for Bioinformatics and Computational Biology offers various bioinformatics programmes.

    The North Carolina State University Statistical Genetics and Bioinformatics Program offers Master's degrees and PhDs in bioinformatics.

    The University of North Carolina at Chapel Hill offers a programme in Bioinformatics and Computational Biology (BCB).

    Ohio

    Andrew Johnson writes: "There is a relatively new Biomedical Informatics program in Ohio. (I'm entering the program in a few months). Though the department stands alone, it is in the College of Medicine at the Ohio State Medical Center. Entrance is offered through a new Integrated Biomedical Sciences Graduate Program."

    Pennsylvania

    The University of Pennsylvania offers some of the best known and longest established bioinformatics programmes at Bachelor's, Master's and PhD levels. Thanks to Louis Licamele for pointing out my oversight (I just assumed I'd already listed them!). He also points out that Georgetown University is planning bioinformatics courses too.

    Texas

    Tom Andrews, a student on the course, has written to me to tell me that Texas A&M University at Corpus Christi is currently offering a BS computer science degree in bioinformatics.

    Jeremy Read told me that St. Edward's University in Austin offers a B.S. in Bioinformatics.

    The Keck Center for Computational Biology---a joint venture of Baylor College of Medicine; University of Houston; Rice University; University of Texas Health Science Center, Houston; M.D. Anderson Cancer Center; and University of Texas Medical Branch, Galveston---offers undergraduate (not 2003) and graduate level training in Computational Biology.

    The University of Texas, El Paso offers a Master's in Bioinformatics.

    Virginia

    George Mason University offers both M.S. and PhD. programmes in Bioinformatics.

    The Virginia Polytechnic Institute and State University's Bioinformatics Institute offers graduate options in Bioinformatics. Thanks to William S. Preissner for correcting this entry.

    Asia

    Hong Kong

    Raymond Lau drew my attention to the Bachelor of Science degree in bioinformatics at the University of Hong Kong.

    India

    Niranjan Swaroop Sharma wrote to tell me about the Bioinformatics Institute of India which is offering a whole range of bioinformatics programmes and qualifications in both regular and distance learning formats. I would have reported on this earlier, but have not been able to view the site in Mozilla. I finally viewed the site using Konqueror today (24Jul03). Perhaps some tinkering with the ASP code is needed there...

    Vaibhav Sinha wrote to tell me that the Institute of Bioinformatics and Applied Biotechnology (IBAB) in Bangalore is offering bioinformatics courses.

    Thanks to Surjeet Singh for drawing my attention to the Indian Institute of Information Technology-Allahabad which runs a Master of Technology (M. Tech Bioinformatics) degree.

    According to Rahul Agrawal, the Indian Institute of Technology Delhi, New Delhi provides courses in Biochemical Engineering and Biotechnology. He adds that another branch of the Institute, IIT Kharagpur also provides various courses in this area.

    There is an Advanced (Graduate) Diploma in Bioinformatics in the Bioinformatics Centre at the Jawaharlal Nehru University.

    Madurai Kamaraj University in Madurai, India claims to have been the first in the country to initiate a bioinformatics programme and advanced diploma in bioinformatics at its School of Biotechnology.

    Risabh Bhandari writes to say:

    "The recently rechristened CBT (Center for Biochemical Technology) [link dead 13Nov02] which is a CSIR Lab [in New] Delhi has started a PG Diploma in Bioinformatics in association with Informatics institute. The course covers a large area in the field with [its] primary focus on computational and programming concepts. The course is 6 months in duration, [and] conducted at the national Head office of [the] Informatics institute."

    The University of Pune, Maharashtra offers its MSc. in Bioinformatics and Advanced Diploma in Bioinformatics at the Bioinformatics Centre, India.

    Uma Paresmeswaran wrote to say that SASTRA, which is based near Trichy, Tamil Nadu, will be offering a B.Tech. programme in Bioinformatics from 2003/2004. Is it the first institute in India to offer this course at the undergraduate level?

    There is, according to Aditi Arur, an MSc distance education program in Bioinformatics, offered by Sikkim Manipal University India.

    Sugandha Singhal wrote to mention the undergraduate and graduate programmes in bioinformatics at Vellore Institute of Technology in Tamil Nadu and the undergraduate programme in bioinformatics at Amity Institute, NOIDA.

    Malaysia

    Dr Amir Feisal Merican wrote to say that the Institute of Biological Sciences, Faculty of Sciences, University of Malaya, Kuala Lumpur, is offering a BSc (Bioinformatics) undergraduate degree programme. Yam confirmed that this degree has been taught for 3 years.

    Alfred Simbun suggested three more Malaysian universities offering bioinformatics degrees: Universiti Industri Selangor (UNISEL), Kolej Universiti Teknologi & Pengurusan Malaysia (KUTPM) and Universiti Kebangsaan Malaysia (UKM).

    Kebangsaan University, Malaysia (UKM) will start to offer a Bachelor's Degree in Bioinformatics to its next intake, in July, 2003.

    Pakistan

    Thanks to Abdul Hameed for pointing out that two universities in Pakistan---COMSATS Institute of Technology and the Mohammad Ali Jinnah University---will offer four-year Bachelor of Science degrees in bioinformatics from September 2003.

    Singapore

    The Bioinformatics Centre of the National University of Singapore offers Undergraduate and PhD programmes in conjunction with the life sciences departments and research institutions at NUS.

    Lam Ah Wah wrote to tell me that the Nanyang Technological University (NTU) starts a BioInformatics undergraduate and part-time post-graduate MSc course in Jul 2002. Be warned: their Web site has a hideous frame/window-based "portal" which breaks half a dozen rules of good interface design. Chua Hian Koon managed to find a better link, and I browsed from there to the syllabus here.

    If you know of any other bioinformatics courses in Asia please feel free to mail me about them.

    Australasia

    Australia

    The Research School of Biological Sciences, at the Australian National University in Canberra offers PhD., MSc. and Honours programs in Bioinformatics.

    You can obtain a Graduate Certificate in Bioinformatics from Curtin University of Technology in Western Australia.

    As of 2001 Flinders University in Adelaide offers a Bachelor's of Science in Bioinformatics.

    The Biochemistry Department of La Trobe University in Victoria also offers an undergraduate course in Bioinformatics.

    The University of Melbourne offers undergraduate study in Bioinformatics. Thanks to Gad.

    There are (according to H L View) PhD, MPhil and Honours programmes in bioinformatics (plus a bioinformatics minor) available at Murdoch University's Centre for Bioinformatics and Biological Computing.

    Rachel Oh said that it is possible to study a near-bioinformatics programme at QUT (Queensland University of Technology): the B. Sci (biotech maj.) & IT (in software engineering & data comms) IF29. A copy of the course is available by searching their Website.

    The University of New South Wales in Sydney offers a Bachelor of Engineering in Bioinformatics.

    According to Jonathan Watts, "Queensland University of Technology in Brisbane QLD offers a Bachelor of Applied Science Innovation, with a major in Bioinformatics" from 2004.

    Sydney University in New South Wales offers a Bachelor's of Science and a postgraduate Master of Applied Science degree in Bioinformatics. Thanks to Dominic Lau and Sebastien Gerega for the update.

    If you know of any other bioinformatics courses in Australasia please feel free to mail me about them.

    New Zealand

    Thanks to Danushka for the information that the University of Auckland, New Zealand has a BSc (Hons) in bioinformatics.

    Europe

    Austria

    A bioinformatics option is offered as part of degree courses at the Graz University of Technology (Technische Universität Graz) in Graz, Austria.

    Belgium

    A consortium including nearly all the French-speaking universities of Belgium (Bruxelles, Liège, Louvain, Mons, Namur and Gembloux) is offering the "Inter-University DEA/DES (Master) in Bioinformatics".

    The Department of Engineering at the Katholieke Universiteit Leuven offers a Master of Bioinformatics degree.

    Denmark

    The Bioinformatics Centre at The University of Copenhagen offers a two-year masters program in bioinformatics. Thanks to Thomas Litman.

    The Technical University of Denmark, Center for Biological Sequence Analysis offers a two-year International MSc. in bioinformatics.

    Syddansk Universitet (The University of Southern Denmark) offers both BSc- and MSc- level Bioinformatik / Experimental Bioinformatics. Thanks to Fiona Nielsen for the updated link---"Center for Experimental Bioinformatics".

    Finland

    The Finnish Graduate School in Computational Biology, Bioinformatics, and Biometry or "ComBi" is a joint venture of the University of Helsinki (English), the University of Turku (English) and the University of Tampere (English).

    France

    Fabio Pardi writes that the Université Paris VII offers a DEA en Analyse de Génomes et Modélisation Moléculaire. Thanks to Brant Inman again for this link to the course.

    Isabelle da Piedade kindly provided this list of Master's and PhD programmes in France:

    Germany

    Thanks to Amelie Stein for several of these entries.

    The Technische Fachhochschule Berlin (University of Applied Science) offers an MSc in Bioinformatics and the Freie Universität Berlin (Free University) offers both an MSc. and a BSc. in Bioinformatics. Thank you to Sebastian Kurscheid for this information.

    Alexandra Reitelmann wrote to say that Bonn-Aachen International Center for Information Technology (B-IT) is offering a new English-language Master's programme in Life Science Informatics. The B-IT is a joint venture between the University of Bonn, the RWTH Aachen University, the University of Applied Sciences Bonn Rhein-Sieg, and Fraunhofer Institutszentrum Birlinghoven Castle (IZB).

    The Institut für Informatik at Johann Wolfgang Goethe-Universität Frankfurt am Main offers a programme in Bioinformatik.

    The Fachhochschule Bingen also offers a bioinformatics degree. Thanks to Manuel Schmidt.

    Bioinformatics can be studied at the Fachhochschule (University of Applied Sciences) Oldenburg/ Ostfriesland/Wilhelmshaven. Thanks to Gerd Klaassen.

    Bioinformatics is taught at Friedrich-Schiller-Universität, Jena. Thanks to Lisa Mullan for the updated link.

    The Interdisziplinäres Zentrum für Bioinformatik at the Universität Leipzig teaches Bioinformatik.

    You can do a PhD in bioinformatics in the Department of Computational Molecular Biology at the Max Planck Institute for Molecular Genetics. Thanks to Martin Okrslar---and to Pooja Jain for the correction to my broken link.

    The Technische Universität München and Ludwig-Maximilians-Universität München also offer Bioinformatik.

    The Universität Tübingen (University of Tübingen) also offers Bioinformatik. Here are their own Frequently Asked Questions (in German only) about studying bioinformatics there.

    Tobias Kailich kindly pointed out that FH Weihenstephan in Freising (near Munich) offers opportunities to study Bioinformatik / Bioinformatics.

    Ireland

    Conor Meehan wrote to say that the National University of Ireland Maynooth set up a four-year Bachelor's course in Computational Biology and Bioinformatics two years ago.

    Israel

    Ben Gurion University, Beer Sheva offers places on the Bioinformatics Track to a select few of the students admitted to its School of Computer Science.

    Tel Aviv University offers a BSc. in Bioinformatics. Thanks to Racheli Zakarin for the link.

    The famous Weizmann Institute in Rehovot teaches an MSc. called "Multidisciplinary Program in Computational Biology and Bioinformatics". This PDF document has more information. Gad Abraham, who told me about this, points out that "all studies there are conducted in English and that there are no tuition fees".

    The Netherlands (Holland)

    The Centre for Molecular and Biomolecular Informatics (CMBI) at the University of Nijmegen offers a Master's degree in bioinformatics. This is a one- or two-year course leading to a degree with the formal title of "Master in Life Sciences", but the subtitle "Bioinformatics".

    Wageningen University offers MSc courses in Bioinformatics. Thank you to Judith Risse.

    Norway

    The Institutt for informatikk (Department of Informatics) of the University of Bergen, Norway offers a Master's degree in bioinformatics.

    Portugal

    There is a post-graduate programme in bioinformatics organized by the Instituto Gulbenkian de Ciência (IGC) and the Faculty of Sciences of the University of Lisboa. (Thanks to Pedro Fernandes.)

    Francisco Rocha wrote to say that Escola Superior de Biotecnologia (ESB) teaches a bioinformatics programme [follow the link labelled "Bioinformática"] in both Lisbon and Oporto. The teaching institution is the Universidade Católica do Porto.

    Sweden

    Bjorn Olsson writes that, as well as a 4-year Master's Degree in Bioinformatics, the University of Skövde offers a number of short courses and allows computer science master's students to include bioinformatics in their degree. There is more information here.

    Daniel Nilsson drew my attention to the MSc in Bioinformatics Engineering in Uppsala. Thanks to Erik Kanders for correcting the link.

    There are also opportunities to study bioinformatics on the "normal" biotech courses in Gothenburg, Linköping and Umeå.

    The Stockholm Bioinformatics Centre, Stockholm University, offers PhD-level shorter courses in bioinformatics subjects.

    The School of Mathematical and Computing Sciences at Chalmers offers undergraduate and Master's programmes in bioinformatics. Thanks to Samuel Hargestam.

    Switzerland

    Fabio Pardi wrote that the Swiss Institute of Bioinformatics offered a Master's degree (DEA). It was a collaboration between the Swiss Institute of Bioinformatics and three faculties of the Universities of Geneva and Lausanne. According to Javier Rojas Balderrama this programme is now closed.

    United Kingdom

    In 2002 I prepared a review of bioinformatics education in the UK for the journal Briefings in Bioinformatics. The article ends with a detailed listing of all current and some future undergraduate and graduate courses in bioinformatics in the UK as of September 2002, along with links. You can read a preprint here.

    Bioinformatics is among the specialisms available on Aberdeen University's MSc/PgDip Information Technology.

    The University of Abertay, Dundee has an MSc./PG Dip in Bioinformatics. Thanks to Dr Nagesh.

    Birkbeck College is a British centre with a proud tradition in educating working and/or mature students to the highest academic standards.

    The University of Birmingham and UMIST offer undergraduate courses in bioinformatics.

    Cambridge University is planning an MPhil in Computational Molecular Biology to start in 2004-2005. Thanks to Antony Quinn for the reminder.

    In October 2004, Cardiff University started two different courses: Bioinformatics or Genetic Epidemiology and Bioinformatics either full-time or part-time and at MSc/PG Cert or Diploma level. Thanks to Ian Brewis, who pointed out that Cardiff's programme is distinguished by offering students a stronger thread of genetic epidemiology for those students interested in this.

    In April 2002 City University's Bioinformatics group moved to the University of Glasgow Department of Computer Science. Thanks to Will Bachelor for alerting me to the existence of this group. City still offers MScs in Pharmaceutical Information Management and Health Informatics.

    Cranfield University at Silsoe offers an MSc. in Bioinformatics.

    Hussein Zedan pointed out that De Montfort University, Leicester was going to start its MSc. in Bioinformatics in September 2003 in both full- and part-time formats.

    The University of East Anglia offers an MSc. in Bioinformatics. Thanks to Dr Nagesh.

    Edinburgh University offers an MSc./Diploma in Quantitative Genetics and Genome Analysis and an MRes (MSc./Diploma by Research) in Life Sciences in which you can specialize in quantitative trait analysis and genomics.

    The University of Exeter offers various graduate programmes: MSc/MRes/PgCert/PgDip in Bioinformatics. (Thanks to M Antro for an update.)

    The University of Glasgow offers an MRes in Bioinformatics.

    In November 2004, Fiona Croll alerted me to Heriot-Watt University's Bioinformatics (IT) MSc jointly taught by the university's School of Mathematical and Computer Sciences and its School of Life Sciences.

    Imperial College offers a new MSc in Computational Genetics and Bioinformatics and MRes Biomolecular Sciences courses.

    There are MRes studentships available on the courses at Leeds University.

    On 20Jan03 UKeU, the UK government-backed company set up to provide online degrees from UK universities to students worldwide, announced a new Master's level programme in Bioinformatics from the Universities of Leeds and Manchester. (Thanks again to Jo Wixon for this.)

    University of Liverpool M.Sc., Postgraduate Diploma and Postgraduate Certificate in Biosystems & Informatics

    Manchester University also teaches bioinformatics to its undergraduates as well as offering a taught MSc. course in the subject.

    Newcastle University's MRes in Bioinformatics began in September 2003.

    The University of Nottingham's undergraduate biochemistry degrees feature bioinformatics prominently.

    Oxford University has a Master's degree course with an interesting flexible structure. Thank you to Helen Parkinson and Clare Hayes for this information.

    Thank you to David Parkinson (no relation to Helen, above) for pointing out to me that for the past two years Sheffield Hallam University has offered an MSc/PGDip in Bioinformatics at its Graduate School in Science, Engineering and Technology.

    The University of Sheffield Centre for Bioinformatics and Computational Biology offers taught courses related to bioinformatics.

    Rafiu Fakunle emailed to tell me that Queen Mary, University of London offers an undergraduate degree in bioinformatics.

    Royal Holloway College in the University of London offers an MSc. in Computer Science by Research in which a bioinformatics specialism is available.

    University College London (UCL) offers a final year undergraduate course: "Bioinformatics: Genes, Proteins and Computers".

    Together with Harrow School of Computer Science, The University of Westminster, a new university in London, offers an MSc. in Bioinformatics as both a full- and part-time course. Again this is aimed primarily at graduates of the biological sciences.

    York University's Department of Biology offers Masters courses and PhDs in both computational biology and biomolecular science.

    If you know of any other bioinformatics courses in Europe please feel free to mail me about them.

    ...Remotely (Distance/Correspondence Courses)

    Many visitors to the FAQ ask about bioinformatics distance learning. Eventually I will try to gather together all those courses on this list that can be taken remotely---if I ever have the time. Unfortunately I don't at the moment. All I can suggest is that you examine the courses yourself through the links provided in the FAQ. Many can be taken over the Net or offer components that can be studied at a distance. (And, if you do compile such a list for yourself, do please email it to me and I will post it here for the benefit of our users with, as usual, a full credit for your efforts.)

    If you are thinking of studying at a UK institution you might want to search through the pre-print of my review of UK bioinformatics education for the word "distance". At the moment I think the courses at Birkbeck, Exeter and Oxford offer either full or part distance learning options.

    Careers: How can I become a bioinformatician?

    How can I get involved?

    If you want to get involved in bioinformatics, now is an exciting time, but (certainly for less senior practitioners) it looks as though demand for bioinformaticians is currently falling, partly for general economic reasons, partly, perhaps, because drugs companies in particular have been disappointed with the pay-off from their investment in the field.

    This section is opinionated; there are people in the field, both computer scientists and biologists, whom I would love to provoke (or convert). If you are a newcomer, and especially if you come from one of bioinformatics' component pure disciplines, I hope my ranted warnings will help you to avoid the mistakes of your predecessors---and I write as one of the mistaken. David S. Roos put it well in his review in the journal Science:

    "Lack of familiarity with the intellectual questions that motivate each side can also lead to misunderstandings. For example, writing a computer program that assembles overlapping expressed sequence tags (EST) sequences may be of great importance to the biologist without breaking any new ground in computer science. Similarly, proving that it is impossible to determine a globally optimal phylogenetic tree under certain conditions may constitute a significant finding in computer science, while being of little practical use to the biologist."

    How can I get involved?---I am a "newbie"

    Please read the education section above for information about some of the places you can currently study bioinformatics. Please do not direct questions about eligibility, course quality or admissions policy to me; ask the individual institutions directly.

    If you are a high school student / sixth former, think about taking an interdisciplinary computational biology or bioinformatics bachelor's degree of the sort offered at, for example, Manchester University in the UK or UPenn in the States. Don't worry if you can't find a place on such a course or there isn't one nearby; perhaps the best way to approach this subject is from two sides. Do a bachelor's degree in one area while taking a healthy interest in the other---or (if you can afford to) complement a first degree in one part of the discipline with a second degree in the second.

    If you already have a degree in a biological discipline there are similar Master's courses---both interdisciplinary (e.g. Birkbeck's in London) and conversion type courses---for biologists or others to learn computer science, for example.

    If you are currently doing a computer science or biology PhD, try to take advantage of the opportunity to take courses in the "other" discipline.

    How can I get involved?---I am a biologist

    To a biologist I would say: take as many real computing courses as you can. It's important not just to learn a programming language, but also to learn the discipline of computing; to structure and document your work in a rigorous way. What courses you take might be directed by the kind of work you are interested in doing when you graduate---whether you see yourself supporting bioinformatics applications or building them. For the former you need all-round familiarity with the programs themselves and the hardware and software needed to run them---plus your existing understanding of biology. For the latter you need to learn a structured programming language and the principles of good program design---plus the ability to talk to and understand biologists.

    Courses biologists might consider taking:

    UNIX

    Of all the computing courses available it is most important that you have a proper introduction to the UNIX operating system(s). Most current bioinformatics software (especially the free stuff) runs on "open" platforms like Linux and the Web. The UNIX philosophy is elegant, powerful, and frustrating. Master it and you will save a lot of time.

    Mathematics

    Learn some maths. Basic statistics, logic/set theory and a little calculus would be my recommendation. Many practising biologists have little or no grasp of elementary concepts like statistical significance, permutations and combinations and the principles of good experimental design. Logic will come in handy at the very least if you want to query databases in an intelligent way.

    Programming

    If you're interested in development, learn a real programming language: Pascal, C(++), Java or Fortran.

    Perl and HTML are the stuff that holds the Web together. A grasp of these is essential for a lot of the Web/database work being done by many bioinformaticians at the moment.

    Good old BASIC can be very useful as an introduction to programming or as a tool in its own right, but none of these latter languages is built to crunch numbers and tackle real world biological problems---which isn't to say people don't try...

    How can I get involved?---I am a computational/quantitative scientist

    One thing that I will emphasise repeatedly in this section is the simple value of doing some "proper" biological laboratory science. I have sat through many talks during which a bioinformatics "scientist" describes in great detail how his---it's usually "his"---application of a trendy mathematical tool offers a supposed insight into a (sometimes supposed) biological problem. Nine times out of ten I know that this method will never be so much as sneezed on by a practising biologist.

    Quantitative scientists sometimes talk about their interest in studying some aspect of "God's mind". Biologists, in contrast, are interested in "Mother Nature". You might meditate on God in the hope of some revelation, but to understand Nature you have to meet her in the flesh. You are as likely to be useful to biologists working in isolation at the keyboard as you are to conceive with your clothes on. Desk-bound bioinformaticians have written code that has turned out to be popular with biologists, but almost always because they have collaborated with biologists.

    Courses quantitative scientists might consider taking:

    Molecular biology

    "MoBi" was the bioinformatics of its day: desperately fashionable, the province of new, higher-paid practitioners and considered with slight suspicion by more traditional biologists. It was once a great achievement to sequence a modest stretch of DNA; now it's a job for robots. Today the technology of molecular biology is very well established. Scientists can buy kits to perform the sort of genetic manipulations that would make your parents' jaws drop. Some of the kits are so simple your small children could use them (with a modest amount of training and supervision).

    Despite the profusion of commercial kits, there is still a requirement for real skill in molecular biology, and the general level of scientific understanding required to be a good biological scientist---rather than just completing a practical class---doesn't come easy. Living matter, the stuff you have to work with, is unpredictable and responds slowly---except when it's dying. Even supposedly fast-growing bacteria can take a long time to yield up their secrets.

    Now, fashions in biomedical research are shifting from molecular biology back to cell biology and protein biochemistry, but it's well worth offering yourself up as a volunteer for some vacation work in a molecular biology lab. The term "molecular biology" is now more often used to refer to the technological tools the field has provided to biology in general, rather than to fundamental research in the field itself. Those tools are common to a vast array of different kinds of research, from archaeology to zoology.

    Protein (bio)chemistry

    Protein (bio)chemistry is experiencing a revival. Proteins are still more delicate and fussy than nucleic acids. The same advice that applies to molecular biology applies to protein biochemistry. That stuff bioinformatics people refer to as "wet lab science" is much harder than it looks.

    You might find it more difficult to get access to a good protein lab than a good molecular biology lab and do protein science with real wizards, but the very least you can do is read about the theoretical aspects of the subject.

    For insights into the principles of protein structure, try, for example, Carl Branden and John Tooze's "Introduction to Protein Structure" [Garland ISBN 0-8153-2305-0]. Physicists in particular might find the lack of general unifying principles in this area overwhelming. Unfortunately there's no substitute for acquiring a "feel" for the subject by examining a lot of examples. The most critical stages in the successful prediction of protein structure from sequence are still those requiring human intervention.

    Thomas E. Creighton has been responsible for a range of standard texts on protein chemistry. If you are working in a protein lab you are likely to come across his "Protein Function: A Practical Approach" [ISBN 019963615X] and the rather more expensive and theoretical "Proteins: Structures and Molecular Properties" [ISBN 071677030X].

    Evolutionary biology

    It's a worn quote, but worth repeating:

    "The mechanisms that bring evolution about certainly need study and clarification. There are no alternatives to evolution as history that can withstand critical examination. Yet we are constantly learning new and important facts about evolutionary mechanisms. Nothing in biology makes sense except in the light of evolution."

    Theodosius Dobzhansky in "American Biology Teacher" vol.35

    Darwin's theory is one of the simplest and most misunderstood in science. Start with a good layperson's introduction: Richard Dawkins's "The Selfish Gene" (and remember: it's a metaphor, stupid) or "Almost Like a Whale", Steve Jones's paraphrasing of Darwin's original "On the Origin of Species". All biologists agree on the underlying principles, but they are nearly ready to kill one another over the details. After reading a decent book on evolutionary biology you should have at least a handful of good questions. Now you are ready to take a class in the subject. Take your questions with you. You'll probably start an argument---or a fight.

    You might also like to peruse Cynthia Gibas's answers to similar questions from computational scientists on the O'Reilly Web site.

    These damned biologists are making me use Word instead of LaTeX to write up---what can I do?

    Try this.

    More general advice

    Use the software

    Get access to an installation of EMBOSS and/or Staden and get someone to lead you through the tools available. RasMol is a simple but powerful and elegant molecular imaging program which can teach you a great deal about biological macromolecules; try a tutorial. Get out on the Web and do some productive surfing for a change :-). The best starting point is the Human Genome Mapping Project Resource Centre's "GenomeWeb". There's so much stuff out there---and most of it is free to academics.

    Where can I find Bioinformatics jobs?

    Start here at Bioinformatics.Org's Job Announcements Homepage...

    Then move on to the appointments / careers sections of the major scientific journals, or, better, search their Web jobs pages with "bioinformatics":

    Appropriately for a Web-dependent discipline, there are a variety of specialist commercial Web sites which carry bioinformatics jobs:

    There are also a number of companies actively recruiting in the area. Here are a few:

    Practical tips

    This section includes some simple rules-of-thumb to apply when performing common bioinformatics tasks. I try to give a reference to a more detailed source of guidance where I know of one.

    How do I find a sequence?

    The most common task in bioinformatics must be the acquisition of some data on which to operate. Usually this is in the form of a nucleic acid or protein sequence, stored as characters in the appropriate alphabet together with a header of related information: for example, some kind of unique identifying number, the species from which the original biological substrate was obtained, the names of any authors who published the sequence and so on.

    You may have already generated your own sequence data experimentally. In this case you are likely to want to find sequences which are identical or similar (and therefore possibly related) to yours. The task is then one of similarity search.
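    The "sequence plus header" layout described above can be sketched in a few lines of Python. This is only an illustration and it assumes the widely used FASTA text format (a ">" header line followed by sequence lines); the record, its identifier "demo0001" and the function name parse_fasta are invented for the example and do not come from any real database.

```python
from typing import Dict, Iterable, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map each record identifier to a (description, sequence) pair."""
    records: Dict[str, Tuple[str, str]] = {}
    identifier = None
    description = ""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                records[identifier] = (description, "".join(chunks))
            header = line[1:].split(None, 1)
            identifier = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif identifier is not None:
            # Sequence lines are concatenated in order.
            chunks.append(line.upper())
    if identifier is not None:
        records[identifier] = (description, "".join(chunks))
    return records


# Invented record: the identifier, description and residues are made up.
EXAMPLE = """\
>demo0001 invented example protein [Homo sapiens]
MKVLAAGIVG
LLLAQPSWA
"""

records = parse_fasta(EXAMPLE.splitlines())
description, sequence = records["demo0001"]
print(description)
print(sequence)
```

    Real database entries carry far richer headers than this, so treat the parser as a starting point rather than a replacement for an established toolkit.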

    ...I have a description.

    A paradoxical problem generated by the success of the bioinformatics revolution is the increasing difficulty of navigating the huge amount of data available. Once you could print out most of the existing sequence databases onto paper and cram them into a single binder. Now a search for "actin" alone will pull out hundreds and hundreds of sequences. The key to finding what you want is to develop your own discriminatory skills rather than rely on computers to figure out what it is you're really after.

    Use Entrez-PubMed

    Make sure you are clear about your aim first. If you are looking for a sequence for a specific scientific purpose then you might do best to start with a relevant human-generated publication. For example, if you have cloned a gene which is part of a well-characterised biochemical pathway and you want to find sequences of the same functional gene product in other species (orthologues), Entrez PubMed is your friend.

    PubMed is a huge and very comprehensive database of the biomedical scientific literature, created by the U.S. National Library of Medicine (NLM). Entrez PubMed is another indispensable resource of the U.S. National Center for Biotechnology Information (NCBI). Both are part of the National Institutes of Health of the U.S. Department of Health and Human Services.

    Use Swiss-Prot

    Swiss-Prot is curated by human beings.

    Use SRS at the RFCGR

    [XXXX INSERT DETAILED ADVICE HERE]

    Use Boolean logic

    [XXXX INSERT DETAILED ADVICE HERE]

    Use cunning

    [XXXX INSERT DETAILED ADVICE HERE]

    ...I have an accession number.

    [XXXX INSERT DETAILED SEQUENCE ADVICE HERE]

    ...I have another sequence.

    This section will be expanded, and there will be a more basic and detailed explanation for novice searchers. In the meantime, here are the top tips cribbed from the excellent paper by Hugh B. Nicholas Jr., David W. Deerfield II and Alexander J. Ropelewski in BioTechniques.

    • Use a local favourite program on the Web server of your choice.
    • Use at least two and preferably three similarity tables.
    • If using Smith-Waterman or FASTA algorithms ensure that the gap opening penalty is high enough.
    • If the initial search finds no or insufficient matches repeat it with a highly diverged matrix and/or with a Smith-Waterman-based server.
    • If this doesn't work try switching from a PAM matrix to a BLOSUM matrix.

    ...I'm not sure whether or not to use the defaults.

    Hugh, David and Alexander again on when not to use the default search parameters provided by a server.

    • ...when the homologues you are looking for to match your query are highly diverged.
    • ...when the query or matches are short.
    • ...when you are only interested in a specific (in the sense of "species") subset of database matches with a particular evolutionary relationship to your sequence of interest---a relationship not implied by the default settings.

    How can I align two sequences?

    This section will also be expanded for newbies; until then, here are Hugh, David and Alexander's tips for alignment:

    • Use an appropriately divergent matrix (I'll be adding a table soon to explain this).
    • Reduce your gap penalty relative to the one you used for your database search.
    • Use the MaxSegs/Waterman-Eggert version of the dynamic programming algorithm to provide the best local alignment and also to search for repeats.
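    To make the role of the gap penalty in these tips concrete, here is a minimal sketch of the basic Smith-Waterman local alignment score. It is not the MaxSegs/Waterman-Eggert variant recommended above (that variant goes on to report further non-overlapping local alignments), and the match, mismatch and gap values are arbitrary assumptions standing in for a real scoring matrix.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of a and b with a constant per-residue gap penalty."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally with a match or a mismatch.
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or open/extend a gap in one of the two sequences.
            up = previous[j] + gap
            left = current[j - 1] + gap
            # A local alignment may restart anywhere, so scores never drop below zero.
            current[j] = max(0, diagonal, up, left)
            best = max(best, current[j])
        previous = current
    return best


# The same pair of toy sequences scored with a mild and a harsh gap penalty.
mild = smith_waterman_score("AAAACCCC", "AAAAGCCCC", gap=-2)
harsh = smith_waterman_score("AAAACCCC", "AAAAGCCCC", gap=-4)
print(mild, harsh)
```

    With the milder penalty the best-scoring path skips the extra G with a gap; with the harsher one an ungapped path containing a mismatch scores higher. That trade-off is what the advice to reduce the gap penalty is about.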

    How can I predict the function of a gene (product)?

    [XXXX INSERT FUNCTION PREDICTION ADVICE HERE]

    How can I predict the structure of a sequence?

    You could start with any one of these excellent guides (listed strictly in alphabetical order):

    How can I simulate a biomolecule?

    Here's Peter J. Steinbach's "Introduction to Macromolecular Simulation".

    How can I write up?

    Go here to download some detailed advice. Go here for more links.

    Glossary of bioinformatics terms

    Here I attempt to define some common terms in bioinformatics. I have tried to balance clarity, brevity and rigour. Let me know if I let one of these priorities over-ride the others.

    What is an alignment?

    When two symbolic representations of DNA or protein sequences are arranged next to one another so that their most similar elements are juxtaposed they are said to be aligned. Many bioinformatics tasks depend upon successful alignments. Alignments are conventionally shown as traces.

    In a symbolic sequence each base or residue monomer in each sequence is represented by a letter. The convention is to print the single-letter codes for the constituent monomers in order in a fixed font (from the N-most to C-most end of the protein sequence in question or from 5' to 3' of a nucleic acid molecule). This is based on the assumption that the combined monomers are evenly spaced along the single dimension of the molecule's primary structure. From now on I shall refer to an alignment of two protein sequences.

    Every element in a trace is either a match or a gap. Where a residue in one of two aligned sequences is identical to its counterpart in the other the corresponding amino-acid letter codes in the two sequences are vertically aligned in the trace: a match. When a residue in one sequence seems to have been deleted since the assumed divergence of the sequence from its counterpart, its "absence" is labelled by a dash in the derived sequence. When a residue appears to have been inserted to produce a longer sequence a dash appears opposite in the unaugmented sequence. Since these dashes represent "gaps" in one or other sequence, the action of inserting such spacers is known as gapping.

    A deletion in one sequence is symmetric with an insertion in the other. When one sequence is gapped relative to another a deletion in sequence a can be seen as an insertion in sequence b. Indeed, the two types of mutation are referred to together as indels. If we imagine that at some point one of the sequences was identical to its primitive homologue, then a trace can represent the three ways divergence could occur (at that point).

    Biological interpretation of an alignment

    A trace can represent a substitution:

    AKVAIL  
    AKIAIL  

    A trace can represent a deletion:

    VCGMD  
    VCG-D  

    A trace can represent an insertion:

    GS-K  
    GSGK  

    For obvious reasons I do not represent a silent mutation.

    Traces may represent recent genetic changes which obscure older changes. Here I have only represented point mutations for simplicity. Actual mutations often insert or delete several residues.
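    The three interpretations above can be read off a trace column by column. The sketch below is only an illustration: the function name classify_trace is made up, and it assumes (as in the examples above) that the first row is the ancestral sequence and the second the derived one.

```python
from typing import List


def classify_trace(ancestral: str, derived: str) -> List[str]:
    """Label every column of a two-row trace.

    Assumption: the first row is treated as the ancestral sequence and
    the second row as the derived one, as in the examples above.
    """
    if len(ancestral) != len(derived):
        raise ValueError("both rows of a trace must have the same length")
    labels = []
    for x, y in zip(ancestral, derived):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if y == "-":
            labels.append("deletion")      # residue lost from the derived row
        elif x == "-":
            labels.append("insertion")     # residue gained by the derived row
        elif x == y:
            labels.append("match")
        else:
            labels.append("substitution")
    return labels


print(classify_trace("AKVAIL", "AKIAIL"))
print(classify_trace("VCGMD", "VCG-D"))
print(classify_trace("GS-K", "GSGK"))
```

    Swapping the two rows turns every deletion into an insertion and vice versa, which is the symmetry that the term "indel" captures.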

    What is a DNA array?

    Thanks to Bioinformatics.Org member Ravi Jain for the following answer, which I present verbatim.

    DNA microarrays consist of thousands of immobilized DNA sequences present on a miniaturized surface the size of a business card or less. Arrays are used to analyze a sample for the presence of gene variations or mutations (genotyping), or for patterns of gene expression, performing the equivalent of ca. 5 000 to 10 000 individual "test tube" experiments in approximately two days of time.

    Robotic technology is employed in the preparation of most arrays. The DNA sequences are bound to a surface such as a nylon membrane or glass slide at precisely defined locations on a grid. Using an alternate method, some arrays are produced using laser lithographic processes and are referred to as biochips or gene chips. The composition of DNA on the arrays is of two general types:

    • Oligonucleotides or DNA fragments (approximately 20-25 nucleotide bases). These arrays are frequently used in genotyping experiments. The sequences of alternate gene forms may be included for detection of mutations or normal variants (polymorphisms).
    • Complete or partial cDNA (approximately 500-5 000 nucleotide bases). These arrays are generally used for relative gene expression analysis of two or more samples; however, oligonucleotide-based arrays may also be used for these studies.

    DNA samples are prepared from the cells or tissues of interest. For genotyping analysis, the sample is genomic DNA. For expression analysis, the sample is cDNA, DNA copies of RNA. The DNA samples are tagged with a radioactive or fluorescent label and applied to the array. Single stranded DNA will bind to a complementary strand of DNA. At positions on the array where the immobilized DNA recognizes a complementary DNA in the sample, binding or hybridization occurs. The labeled sample DNA marks the exact positions on the array where binding occurs, allowing automatic detection. The output consists of a list of hybridization events, indicating the presence or the relative abundance of specific DNA sequences that are present in the sample.
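    The base-pairing rule behind hybridization can be illustrated with a short sketch. It assumes perfect Watson-Crick complementarity and plain A/C/G/T sequences written 5' to 3'; real hybridization tolerates some mismatches and depends on experimental conditions, none of which is modelled here. The function names are invented for the example.

```python
# Translation table for Watson-Crick base pairs (A-T, C-G).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with seq, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybridizes(probe: str, sample: str) -> bool:
    """True if the sample strand contains a perfect complement of the probe.

    Simplifying assumption: only exact, full-length complementarity counts.
    """
    return reverse_complement(probe.upper()) in sample.upper()


print(reverse_complement("ATGC"))
print(hybridizes("ATGC", "TTGCATTT"))
```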

    What is a homologue?

    "Homology" is a much-misused term that existed in biology long before the notion of protein sequences. Strictly, homology cannot be qualified; it is not correct to state that two proteins are "30% homologous" with each other, for example. Two sequences either share a common ancestor or they do not; a percentage can only describe their similarity or identity. If we could look back far enough in the evolutionary histories of any two molecules under comparison, we would be guaranteed to find a common ancestor eventually, but this is not true homology. An example of true homology would be the relationship between two variants of a single ancestral enzyme resulting from a gene duplication event.

    As a rule-of-thumb, true homology should be assigned only when the feature which leads us to suspect a relationship between molecules is one we consider likely to have derived from the molecules' common ancestor. To quote Page and Holmes [Molecular Evolution: A Phylogenetic Approach, Roderick D. M. Page and Edward C. Holmes; Blackwell Scientific; ISBN 0865428891]:

    "The classic molecular example is the parallel evolution of amino acid sequences in the lysozyme enzyme in leaf-eating langur monkeys and in cows. Both animals have independently evolved foregut fermentation using bacteria, and in both cases lysozyme has been recruited to degrade these bacteria. Therefore, langur and cow lysozymes are homologous as genes; however, as digestive enzymes they are not homologous because this functionality was not present in the ancestral lysozyme."

    Although sequence determines structure, it is possible for two proteins to have very different sequences and functions and yet share a common fold. In fact, most gene products with similar three-dimensional structures are insufficiently similar at the sequence level for true homology or analogy (non-homologous similarity) to be distinguished.

    What is an ontology?

    Biology is changing from being a descriptive to an analytical science. Accurate and consistent descriptions are, however, vital to analysis. The idea of ontologies has been co-opted from philosophy and artificial intelligence to partition bioinformatic knowledge in a way which can be reliably navigated by computers.

    This preprint of a review by Ele Holloway of the European Bioinformatics Institute gives a more detailed insight into the varied approaches to ontologies in bioinformatics by covering a recent meeting on the subject. The final version appears in Comparative and Functional Genomics.

    What is a scoring matrix?

    The following explanation was edited from a contribution by Amelie Stein.

    The aim of a sequence alignment is to match "the most similar elements" of two sequences. This similarity must be evaluated somehow. For example, consider the following two alignments:

    (a)

    AIWQH
    AL-QH

    (b)

    AIWQH
    A-LQH

    They seem quite similar: both contain one "indel" and one substitution, just at different positions. However, if we think of the letters as amino acid residues rather than elements of strings, alignment (a) is the better one, because isoleucine (I) and leucine (L) have similar sidechains, while tryptophan (W) has a very different structure. This is a physico-chemical measure; we might prefer these days to say that leucine simply substitutes for isoleucine more frequently---without giving an underlying "reason" for this observation.

    However we explain it, it is much more likely that a mutation changed I into L and that W was lost, as in (a), than that W changed into L and I was lost. We would expect that a change from I to L would not affect the function as much as a mutation from W to L---but this deserves its own topic.

    To quantify the similarity achieved by an alignment, scoring matrices are used: they contain a value for each possible substitution, and the alignment score is the sum of the matrix's entries for each aligned amino acid pair. For gaps (indels), a special gap score is necessary---a very simple one is just to add a constant penalty score for each indel. The optimal alignment is the one which maximizes the alignment score.
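    The scoring rule just described (sum the matrix entries for each aligned pair, add a constant penalty per indel) can be sketched as follows. The numbers are invented purely for illustration and are not taken from any published PAM or BLOSUM matrix; only the ordering (I/L scores better than W/L) mirrors the argument above.

```python
# Invented toy scores, NOT values from a real PAM or BLOSUM matrix.
TOY_SCORES = {
    frozenset("IL"): 2,    # isoleucine <-> leucine: conservative change
    frozenset("WL"): -2,   # tryptophan <-> leucine: unlikely change
}


def pair_score(x: str, y: str, identity: int = 4, default_mismatch: int = -1) -> int:
    """Score one aligned pair of residues using the toy table."""
    if x == y:
        return identity
    return TOY_SCORES.get(frozenset((x, y)), default_mismatch)


def alignment_score(row1: str, row2: str, gap: int = -4) -> int:
    """Sum the pair scores over all columns, charging a constant penalty per indel."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row1, row2):
        if x == "-" or y == "-":
            total += gap
        else:
            total += pair_score(x, y)
    return total


score_a = alignment_score("AIWQH", "AL-QH")
score_b = alignment_score("AIWQH", "A-LQH")
print(score_a, score_b)
```

    Under these assumed values alignment (a) scores higher than alignment (b), in line with the reasoning above. A real search would take its values from an established matrix.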

    PAM matrices are a common family of scoring matrices. PAM stands for Point Accepted Mutation (often expanded as Percent Accepted Mutations), where "accepted" means that the mutation has been retained by natural selection in the sequence in question. Thus, using the PAM 250 scoring matrix assumes that about 250 mutations per 100 amino acids may have happened (some positions having changed more than once), while with PAM 10 only 10 mutations per 100 amino acids are assumed, so that only very similar sequences will reach useful alignment scores.

    PAM matrices contain positive and negative values. If the alignment score is greater than zero, the sequences are considered to be related (they are similar with respect to the scoring matrix used); if the score is negative, it is assumed that they are not related. "Relationship" here may refer to evolution as well as to the functionality of the proteins. The choice of matrix affects the result, so one has to make an assumption about the similarity of the sequences in order to obtain a useful result: rather distant sequences won't produce a good alignment using PAM 10, and the optimal alignment of two very similar sequences with PAM 500 may be less useful than that with PAM 50.

    Finally, it should be noted that some scoring matrices use similarity to evaluate alignments while others use distance, so be careful when interpreting the results!

    After this brief and necessarily superficial overview, you might want to read some more about scoring matrices.

    Acknowledgements

    Questions

    Thanks to the following people for questions:

    • Jonathan Després
    • Salma B. Rafi
    • "Ritu"
    • Amelie Stein
    • Michael Wentzel

    Links

    Thanks to the following people for corrections, links and sources:

    • Anuradha Acharya
    • Charles Adair
    • Rahul Agrawal
    • Aja
    • Ken Allen
    • Tom Andrews
    • M Antro
    • Aditi Arur
    • Paulo Almeida
    • Jeff Ames
    • Jim Auer
    • Will Bachelor
    • Justin Baker
    • Javier Rojas Balderrama
    • Nigel Barber
    • Risabh Bhandari
    • Ruediger Braeuning
    • Ian A Brewis
    • Pierre Bushel
    • Debra Burhans
    • Andrea Cabibbo
    • Chua Hian Koon
    • Betty Cheng
    • Leonard Crane
    • Fiona Croll
    • Paul Curley
    • Danushka
    • David Delong
    • Maureen Downey
    • Steffen Durinck
    • Lynda Ellis
    • Rafiu Fakunle
    • Pedro Fernandes
    • Matthew Foster
    • Gad
    • Momchil Georgiev
    • Sebastien Gerega
    • Jesmminder Gill
    • Georges Grinstein
    • Mike Goodrich
    • Brandon H.
    • Maximilian Haeussler
    • Abdul Hameed
    • Anu Haniharan
    • Samuel Hargestam
    • Clare Hayes
    • H. L. Hiew
    • Ele Holloway
    • Matt Hope
    • Benjamin Horsman
    • Brant Inman
    • Ivanovi
    • Pooja Jain
    • Jim
    • Andrew Johnson
    • Bulat K
    • Tobias Kailich
    • Erik Kanders
    • Kevin Karplus
    • Beatrice Kilel
    • Gerd Klaassen
    • David Klemitz
    • Peter Kublik
    • Sebastian Kurscheid
    • Dominic Lau
    • Raymond Lau
    • Darren Lee
    • Wentian Li
    • Louis Licamele
    • Jeff Ligas
    • Olga Likhodi
    • Thomas Litman
    • Steve Masticola
    • Matt at ColorBasePair.com
    • James McInerney
    • Conor Meehan
    • Junaid A. Mehta
    • Moustafa
    • Lisa Mullan
    • David Murphy
    • Feisal Merican
    • Markus Montigel
    • Dr. Nagesh
    • Pablo Nehab-Hess
    • Alex O'Neill
    • Brittany Nielsen
    • Fiona Nielsen
    • Daniel Nilsson
    • Rachel Oh
    • Martin Okrslar
    • Bjorn Olsson
    • Uma Parameswaran
    • Fabio Pardi
    • David Parkinson
    • Helen Parkinson
    • Rama Penta
    • Isabelle de Piedade
    • Jean-Etienne Poirrier
    • William S. Preissner
    • Antony Quinn
    • Jeremy Read
    • G. Deepak Reddy
    • Alexandra Reitelmann
    • Judith Risse
    • Francisco Rocha
    • John Rowland
    • Vishal Rupani
    • Amit Sabnis
    • Manuel Schmidt
    • Sentausa
    • Cathal Seoighe
    • Niranjan Swaroop Sharma
    • Richard Sheehan
    • Nihar Sheth
    • Bolanle Shoge
    • Alfred Simbun
    • Sugandha Singhal
    • Vaibhav Sinha
    • Amelie Stein
    • Jennifer Steinbachs
    • Mattias Thorslund
    • James Thompson
    • Natalie Twine
    • Eric VanWieren
    • Catherine Velazquez
    • Lam Ah Wah
    • Jonathan Watts
    • Kathy Wiederin
    • Linda Wilson
    • Yam
    • Tim Young
    • Zuthur Yew
    • Racheli Zakarin
    • Hussein Zedan
    • Humberto Ortiz Zuazaga
    • Michael Zuker

    Answers

    Thanks to the following people for suggesting answers:

    • Jeff Bizzaro
    • Paul Boardman
    • Ravi Jain
    • Alex Kasman
    • Sangeeta Sawant
    • Fredj Tekaia
    • Jo Wixon

    Small Print:

    Author and licensing

    This resource is maintained by and © Damian Counsell, UK Medical Research Council Rosalind Franklin Centre for Genomics Research (the RFCGR) 1998-2004. It is made available under a modified version of the Open Publication Licence.

    The FAQ has also been mirrored, without credit or any attempt to link to the Open Content Licence, at the so-called "National Bioinformatics Institute". If you are thinking of handing over money for their "certification" you can draw your own conclusions about their standing from this fact.

    The first version of this Bioinformatics FAQ was prepared when I was responsible for bioinformatics in the Section for Cell and Molecular Biology at the Institute of Cancer Research (the ICR) in London.

    I am now a Bioinformatics Specialist at the Rosalind Franklin Centre for Genomics Research, where I am part of the Proteomics Group, and am supported by the Medical Research Council. This page does not represent their views, but I will happily read your criticisms. Although I may act on your advice, I take no responsibility for anything that might happen if you browse here.

     

     
