Data Science

 

We have been working on three types of data science projects:

 

- Applications: We use data science tools to extract relevant information about a specific phenomenon. The main focus is the application, and the goal of a thesis is to produce results relevant enough to be published in a conference or journal specialized in the data's domain. We have been working on the prediction of non-coding RNAs and the reconstruction of regulatory networks from microarray data, and we are trying to apply machine learning tools to other domains.

 

     Master's theses of this type usually require the student to have an advisor specialized in the application's domain, in addition to a member of the laboratory.

 

- Distributed/Parallel implementations of Machine Learning algorithms: Recent advances in technology have made it possible to collect large amounts of data. For example, millions of transactions are made daily on sites like Amazon, and thousands of human genomes are sequenced every week. Analyzing such large amounts of data requires distributed/parallel implementations of Machine Learning algorithms.

 

- Understanding ML algorithms: Projects of this type are about understanding how machine learning algorithms behave with respect to their parameters and to the properties of the data.

 

  Some machine learning algorithms are based on heuristics, and our goal is to obtain formal results that explain why those heuristics work.