Štefan Urbánek
data brewmaster, systems architect, knowledge designer
You have data? Or you think you might have data somewhere and you do not know how to get them? You want to know what the data say?
You know what you want to get from your data, but you do not know how? Or you want to know what you can get from your data?
You want to know the quality of the data you are using? You want to know whether, how, and to what extent that quality can be improved? Or at least how it can be measured?
How?
- First, we talk. You explain your problem, and together we prepare a requirements specification.
- If we are clear on what has to be done, I design a solution.
- I analyse and prepare your data.
- If required, I provide you with software to automate data processing and data monitoring.
What?
Technologies
Languages and environments: Python, Ruby, SQL
Databases: SQL (PostgreSQL, Oracle 8 to 10), NoSQL (MongoDB, CouchDB)
Data Mining: SPSS Clementine
Requirements Specification
- Help you describe correctly, consistently, completely and unambiguously what you want.
- Reduce development time and costs.
- Provide a basis for cost and time estimation and for verification, and lower the risk of unnecessary costs caused by changing requirements.
Data Automation
- Automatically perform data checks from a business perspective
- Automated data validation and data quality checks
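As an illustration of what an automated business-level check can look like, here is a minimal Python sketch. The rule names, the invoice fields and the accepted currencies are all hypothetical; real rules come out of the requirements specification.

```python
from datetime import date

# Hypothetical business rules: rule name -> predicate over a single record.
RULES = {
    "amount_is_positive": lambda r: r["amount"] > 0,
    "currency_is_known": lambda r: r["currency"] in {"EUR", "USD"},
    "not_dated_in_future": lambda r: r["issued"] <= date.today(),
}

def validate(records):
    """Return a list of (record index, failed rule name) pairs."""
    failures = []
    for i, record in enumerate(records):
        for name, check in RULES.items():
            if not check(record):
                failures.append((i, name))
    return failures

# Made-up sample data, for illustration only.
invoices = [
    {"amount": 120.0, "currency": "EUR", "issued": date(2011, 3, 1)},
    {"amount": -5.0, "currency": "XXX", "issued": date(2011, 3, 2)},
]
failures = validate(invoices)
for index, rule in failures:
    print(f"record {index} failed rule {rule}")
```

Keeping the rules in a plain dictionary means a new check is one extra line, and the list of failures can feed a report or a notification.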
Reporting
- What is the story behind your data?
- What are the relationships between entities, and how do those entities behave?
Data Audit
- Where do your data come from, and how? What is in your data? What is the structure? What are the dependencies and constraints?
Data Cleansing
- Identify duplicates
- Homogenize values to make them consistent
- Trim, split, break down, combine, derive, mix, burn, melt, chop
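A minimal Python sketch of the first two steps, assuming records are plain dictionaries and that two records count as duplicates when their normalized key fields match. The field names and sample customers are made up for illustration.

```python
def normalize(value):
    """Trim, collapse inner whitespace and lower-case a text value."""
    return " ".join(value.split()).lower()

def deduplicate(records, key_fields):
    """Keep the first record per normalized key; return (unique, duplicates)."""
    seen = set()
    unique, duplicates = [], []
    for record in records:
        key = tuple(normalize(record[f]) for f in key_fields)
        if key in seen:
            duplicates.append(record)
        else:
            seen.add(key)
            unique.append(record)
    return unique, duplicates

# Made-up sample data, for illustration only.
customers = [
    {"name": "Acme  Ltd.", "city": "Bratislava"},
    {"name": "acme ltd. ", "city": " bratislava"},
    {"name": "Brewery s.r.o.", "city": "Martin"},
]
unique, duplicates = deduplicate(customers, ["name", "city"])
print(len(unique), "unique,", len(duplicates), "duplicate")
```

Exact matching on normalized keys only catches the easy cases. Misspellings and reordered names need fuzzier matching, which this sketch does not attempt.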
Data Quality Monitoring
- How complete are your data?
- How consistent and how up-to-date are your data?
- How can you improve the quality, and to what extent?
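Completeness is the easiest of these to put a number on. The Python sketch below assumes that a value counts as missing when it is `None` or an empty string; the field names and contacts are hypothetical.

```python
def completeness(records, fields):
    """Return the share of non-empty values per field, from 0.0 to 1.0."""
    total = len(records)
    result = {}
    for field in fields:
        # Assumption: None and "" both mean "missing".
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        result[field] = filled / total if total else 0.0
    return result

# Made-up sample data, for illustration only.
contacts = [
    {"name": "Anna", "email": "anna@example.com", "phone": None},
    {"name": "Boris", "email": "", "phone": "0900 000 000"},
    {"name": "Cyril", "email": "cyril@example.com", "phone": None},
    {"name": "Dana", "email": "dana@example.com", "phone": ""},
]
report = completeness(contacts, ["name", "email", "phone"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

Running the same measurement on a schedule and storing the results turns a one-off audit into monitoring: a falling percentage shows where quality is slipping.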
Data Architecture Design
- What should data structures and their relationships look like for a given purpose?
- Create entity-relationship and multidimensional models.
Extraction, transformation, loading
- Development of extraction, transformation and data loading tools
- Automated process monitoring and notifications