#Python

  • Pandas functions instead of iteration

    Coming back to Python and pandas from Go has made me aware of the quirks of using dataframes in place of typed data structures.

    While pandas has great convenience features for basic manipulation of tables, munging gets trickier in places where you’d want to use a map or hash in other languages. That situation is actually common, and while pandas has a MultiIndex feature, it is a mistake to combine it with the common Python iterator pattern, as in "for star in stars:". Iterating this way over complex, large datasets can be inefficient and slow. The built-in pandas functions are the faster, more efficient way to do this.
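    As a rough illustration of the difference, here is a minimal sketch that computes the same keyed lookup twice: once with a row-by-row loop that fills a dict, and once with groupby. The dataframe, its column names, and the values are all made up for this example, and groupby is only one of several pandas functions that could fill this role.

```python
import pandas as pd

# Hypothetical data: star ids, group labels and magnitudes are invented
# purely for illustration.
stars = pd.DataFrame(
    {
        "constellation": ["A", "A", "B", "B", "B"],
        "name": ["s1", "s2", "s3", "s4", "s5"],
        "magnitude": [0.5, 1.5, 3.0, 0.25, 2.0],
    }
)

# Iterator pattern: fill a dict of lists one row at a time, the way you
# might use a map or hash in another language.
by_constellation = {}
for _, star in stars.iterrows():
    by_constellation.setdefault(star["constellation"], []).append(star["magnitude"])
loop_result = {key: min(values) for key, values in by_constellation.items()}

# pandas functions: groupby performs the same keyed aggregation without
# an explicit Python-level loop over rows.
lowest = stars.groupby("constellation")["magnitude"].min()

# A plain dict is still one call away when map-style access is needed.
func_result = lowest.to_dict()

print(loop_result)
print(func_result)
```

    Both approaches give the same answer on a table this small. The gap only becomes noticeable as the number of rows grows, since iterrows builds a new Series object for every row.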

  • Mapping global viral epidemics in real time

    The intersection of data science and epidemiology offers amazing opportunities to enhance quality of life for vast swathes of at-risk people, a new superpower conferred by the advent of cheap, rapid genetic sequencing. It might even help us avert the increasingly likely next global pandemic.

    H7N9 viral spread

    Another great example of the power of off-book, part-time projects, NextStrain was hatched after the two researchers responsible met at a conference and talked about the idea. Those researchers, Richard Neher of the University of Basel and Trevor Bedford of the Fred Hutchinson Cancer Research Center in Seattle, recently won the Open Science Prize for their contributions.