Archives for Data Technologies
Physical Database Design and Tuning – Oracle or SQL Server
Troubleshooting Database Performance - 3 Broad Categories: Physical Database Design, Query Statement Tuning, DB Configuration. Physical Database Design: INDEXING - Look for fragmentation; if frag > 0%, try rebuild…
Couchbase vs DynamoDB
Couchbase Advantages: Run on almost any cloud platform – including AWS; Avoid DynamoDB’s item-size restrictions; Speed up performance with in-memory processing and built-in caching; Use your team’s existing SQL skills…
MariaDB auto update statistics
To check whether auto update is enabled on statistics, try this command: show variables like '%metadata%'; If you see output such as innodb_stats_auto_recalc = 1, you're all set. If you…
The BigData Landscape
This is a Work in Progress… Pre-Processing of Data (Unstructured): MapReduce. Pre-Processing of Data (Structured or Semi-Structured): Pig, Hive, Hadoop (see below). Statistical Analysis (After Pre-Processing): R…
Running out of disk space? Sharding
What is automatic sharding? Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards.
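As a rough illustration of the partitioning idea (not a description of how any particular database implements automatic sharding), here is a minimal Python sketch that assumes hash-based routing across a fixed number of shards. The names `NUM_SHARDS`, `shard_for`, and `partition` are hypothetical and chosen only for this example.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap
    # it into the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards: int = NUM_SHARDS):
    """Group keys into per-shard lists (a stand-in for separate databases)."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards
```

The same key always routes to the same shard, so a lookup only has to touch one of the smaller parts. Real systems often use range-based or consistent-hashing schemes instead, so that adding a shard does not force most keys to move.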
Data warehouse versus data marts
Most data warehousing initiatives fail, mainly because this level of standardization slows down an agency/company enough that the project gets derailed (the "boiling the ocean" phenomenon). Avoid building a Data Warehouse…
Multiple copies of data–and editability
A book has an author – an author has multiple books. A book would be modeled as a document in NoSQL – as would an author. So - we end up…
Tracking relationships in NoSQL
In NoSQL, there is no built-in way to 'relate' the post with the comments. So, what do you do? Well - you essentially store the postId and the commentId - for…
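One way to picture storing the ids is the following minimal Python sketch, which uses plain dictionaries as a stand-in for a document store. The excerpt is truncated, so the layout here is an assumption: each comment document carries its `postId`, each post document carries a list of `commentIds`, and the helper `comments_for_post` is a hypothetical name.

```python
# Plain dictionaries stand in for two document collections (assumed layout).
posts = {
    "post-1": {
        "postId": "post-1",
        "title": "Hello",
        "commentIds": ["c-1", "c-2"],  # ids stored on the post
    },
}

comments = {
    "c-1": {"commentId": "c-1", "postId": "post-1", "text": "First"},
    "c-2": {"commentId": "c-2", "postId": "post-1", "text": "Second"},
    "c-3": {"commentId": "c-3", "postId": "post-9", "text": "Elsewhere"},
}


def comments_for_post(post_id, posts, comments):
    """Resolve a post's comments by following the stored ids in application code."""
    post = posts[post_id]
    # The store does not join for us, so each id is looked up explicitly.
    return [comments[cid] for cid in post["commentIds"] if cid in comments]
```

Calling `comments_for_post("post-1", posts, comments)` returns the two comment documents whose ids are listed on the post. The relationship is maintained by the application rather than enforced by the store, so both sides have to be kept in sync when a comment is added or removed.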
NoSQL and data integrity
Redundant Data Storage: NoSQL stores many-to-many relationships in the same way that de-normalized tables do – by storing them redundantly. Since you do not base your NoSQL design…