Projects

DevOps Engineer at GfK SE online market research (Apr. 2017 – Sept. 2017)

The Project

  • Data from many different devices (PC, smartphone…) are sent to a collector cloud
  • The ETL system periodically fetches the data, transforms them, and loads them into HDFS / Hive
  • Some reports are already generated based on the cleaned data

My Tasks

  • Take care of all running ETLs
  • Look for possible problems
  • Fix bugs in the ETL software

Techniques Used

  • Cloudera CDH4, CDH5, Hadoop, Hive, Oozie, Pig, HUE
  • Icinga, Grafana2, Eclipse, Gradle, Maven
  • Linux, Git/Stash/Bitbucket, Confluence, Bamboo, Scrum

 

Developer at GfK SE online market research (June 2016 – Mar. 2017)

The Project

  • Data from different metering platforms have to be loaded into a data lake
  • Before loading, the data have to be enriched, transformed, partially aggregated, and deduplicated
  • Incoming and outgoing data have to conform to different schema versions

My Tasks

  • Design a complex dataflow pipeline in Crunch and Beam
  • Tune and test the pipeline in a real-world scenario
  • Manage and implement the transition from Hadoop to Spark

Techniques Used

  • Cloudera CDH5, Hadoop, Hive, Oozie, Spark, Crunch, Beam, Kite …
  • Icinga, Graphite, Eclipse, Gradle
  • Linux, Git/Stash, Confluence, Bamboo, Scrum

 

DevOps Engineer at GfK SE online market research (Oct. 2015 – May 2016)

The Project

  • Data from many different devices (PC, smartphone…) are sent to a collector cloud
  • The ETL system periodically fetches the data, transforms them, and loads them into HDFS or Hive
  • Some reports are already generated based on the cleaned data

My Tasks

  • Take care of all running ETLs
  • Look for possible problems
  • Fix bugs in the ETL software

Techniques Used

  • Cloudera CDH4, Hadoop, Hive, Oozie, Pig
  • Icinga, Grafana, Eclipse, Gradle
  • Linux, Git/Stash, Confluence, Bamboo, Scrum

 

Evaluation, Design and Implementation of RTA Use Cases (May 2015 – Sept. 2015)

The Project

  • Evaluation of big data streaming systems
  • Decision support for choosing an RTA platform
  • Design and build example use cases

My Tasks

  • Evaluate different big data streaming systems (Spark, Flink, and others)
  • Design and implement different use cases for evaluation and proof of concept
  • Design and build a real-time analytics platform

Techniques Used

  • Flink, Kafka, Zookeeper
  • Java 8, JUnit, Mockito, Maven, Eclipse
  • Git, Confluence, JIRA, Scrum

 

Development of an ERP system (Oct. 2014 – April 2015)

The Project

  • Create a web application for management of customer contracts
  • Manage accounting and billing
  • Support internal workflows

My Tasks

  • Database design
  • Create web application for employees

Techniques Used

  • Java 7, Tomcat 7, JPA 2, Hibernate, Spring 4 Web MVC, Spring Data
  • DBUnit, JUnit
  • Maven 2

 

Development of a CRM system (Oct. 2014 – May 2015)

The Project

  • Import, integrate and analyse data of customers
  • Create a web application for customers and employees
  • For a financial institution

My Tasks

  • Database design, import of CSV data, creation of an integrated view of the data
  • Design and implement web applications
  • Create reports for controlling

Techniques Used

  • PHP 5.4, MySQL 5, Java 7, SQL

 

Stratosphere (research assistant) (2009 – 2014)

The Project

  • Big Data analytics system (now Apache Flink) in the Cloud
  • massively parallel data analysis, comparable to Apache Hadoop
  • complex ad hoc analysis programs on very large data sets

My Tasks

  • developed different components in the database core
  • successfully led, designed, and developed a metadata collection framework
  • developed extensible, high-performance modules that perform query execution and metadata (esp. statistics) collection at the same time
  • designed a distributed store with a central indexing component for very fast access to the metadata
  • planned and coordinated the work of six students on the same project

Techniques Used

  • Java 6 and Java 7, Maven, Jenkins
  • Dataflow Language (Meteor), JSON
  • SQL, Hadoop
  • Libraries such as Kryo

 

Teacher for Database Principles & Big Data Systems (2010 – 2014)

The Project

  • Taught students database architecture, design, development, and programming, as well as languages such as SQL, Meteor, Hive, PigLatin, and AQL
  • Gave excellent talks presenting Big Data systems such as Stratosphere, AsterixDB, and Hadoop

My Tasks

  • successfully managed the course “Principles of Database Systems” and taught its students
  • managed the course “Big Data Systems” and taught its students
  • managed the course “Map/Reduce” and taught its students

Techniques Used

  • entity relationship models, relational algebra, SQL, JDBC
  • Stratosphere, AsterixDB, Hadoop, IBM DB2
  • dataflow languages (Meteor, Hive, PigLatin), JSON, XML

 

Data warehouse at a private insurance company (2007 – 2008)

The Project

  • insurance company introduced a new data warehouse system
  • had to re-develop highly complex SQL queries (now multitenant)
  • computed key performance indicators for the company

My Tasks

  • developed efficient, highly complex SQL queries over very large data sets
  • made the queries multitenant and robust

Techniques Used

  • SQL
  • MS Excel, MS Access

 

Java Developer in the financial sector (2001 – 2007)

The Project

  • Web-based information system for cash logistics
  • generated cost-optimal proposals for filling ATMs with cash
  • complete handling of cash refilling orders for ATMs (create, edit, issue, monitor, and close)

My Tasks

  • refactored a backend module that computes proposals for filling ATMs with cash
  • extended, tuned, and modularised the backend module
  • was involved in the whole software development cycle

Techniques Used

  • Java, SQL, Ant
  • Hibernate, Struts
  • DB-Design, DB-Tuning, programming triggers in PL/SQL on Oracle