Lost in Translation: Dealing with Data Diversity

While Canada welcomes diverse cultures, researchers on the Efficient Dairy Genome Project are finding diverse data to be an unexpected challenge. As they gather information from international partners to apply genomics to raising feed efficiency and lowering methane emissions in dairy cows, Genome Alberta scientists might be reminded of an old adage: Be careful what you wish for.

It takes a village

In studies like this, the more data you collect, the greater your chances of success. Because collecting that data takes considerable time and money, dairy researchers opted to combine data from research groups around the globe that are working on the same traits. The data, which consists of information on genotypes, pedigrees and phenotypes, is coming from partners in Australia, the United States, the U.K., Switzerland, Denmark and Canada.

“My focus so far has been describing the data sets for methane emission and feed efficiency gathered from our Canadian herd,” said Audrey Martin, research associate in the Department of Biosciences at the University of Guelph. “To describe these traits, I have to find the genetic parameters and develop a proper model to fit the data, and that requires more data than we can glean from a small herd.”


Don’t try this at home

To acquire the additional information in a cost-effective manner, Martin is working with a number of international data sets. It might sound simple enough, until you actually go to do it.

“Processing data from other countries is proving to be harder than we thought. The way we measure feed efficiency and methane emission in Canada is different from how it’s done elsewhere, so we’re trying to standardize everything, but that’s not always easy.”

For example, Canada scores body condition on a scale of 1 to 5, whereas Australia’s scale runs from 1 to 8. To overcome such discrepancies, Martin must reconcile the different scales and generate values that will work for her purposes.
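
The article does not say how the scales are actually reconciled, so the following is only an illustrative sketch of one simple option: a linear rescaling that maps the ends of one scale onto the ends of the other. The function name rescale_score and the sample scores are hypothetical, and the approach assumes both scales measure the same thing in evenly spaced steps, which may not hold for real body condition scores.

```python
def rescale_score(score, src_min, src_max, dst_min, dst_max):
    """Linearly map a score from one scale onto another.

    Hypothetical helper for illustration only; not the project's method.
    """
    if src_max <= src_min:
        raise ValueError("source scale must have src_max > src_min")
    if not src_min <= score <= src_max:
        raise ValueError(f"score {score} is outside {src_min} to {src_max}")
    # Position of the score within the source scale, from 0.0 to 1.0.
    fraction = (score - src_min) / (src_max - src_min)
    # Same relative position on the destination scale.
    return dst_min + fraction * (dst_max - dst_min)


# Made-up scores on a 1 to 8 scale, converted to a 1 to 5 scale.
australian_scores = [1, 4.5, 8]
canadian_equivalents = [rescale_score(s, 1, 8, 1, 5) for s in australian_scores]
print(canadian_equivalents)
```

A mapping like this preserves only relative position on the scale. If the two countries define their score categories differently, a lookup table agreed on by the partners would likely be a better fit than a formula.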

It is slow going right now, but Martin is motivated by the project’s potential impact. Countries might measure key traits differently, yet they share an interest in the final outcome.

“The public might not understand the importance of improving feed efficiency and methane emission, but breeders and farmers in these countries certainly do. If cows can produce the same amount of milk with less feed required, the savings to industry could be substantial. At the same time, there is a worldwide effort to reduce greenhouse gases, and methane from cows is one of the sources.”

Just as collecting the data was a joint venture, so too will be the effort to standardize it.

“Though Canada is the central collection point for the data, the problem stems from the initial values compiled by each nation, so we’ll need everyone working together to solve this.”

For Martin, what started as a four-month internship is turning into a longer stint on the project. That’s fine with her, and she’s optimistic going forward.

“Things are a bit messy right now, but all parties are pitching in and we’ll get this figured out. We have some good first results from very small samples, and we will continue to collect data after the project is done to maintain progress on these critical traits.”

Ultimately, dealing with diverse data and dealing with diverse cultures have something in common: each is a daunting challenge, but well worth the effort.