Before the SQL Server 2012 release, the product was considered a database management system for small and medium enterprises. Starting with the 2012 release, which added high-end data-center management capabilities, the database engine is no longer considered suitable only for medium-scale enterprises. In November 2019, SQL Server 2019 Big Data Clusters were introduced, giving users the ability to build a Big Data ecosystem.
This article briefly covers nine features added since SQL Server 2008 that make SQL Server more than a traditional database management system.
This feature was added in SQL Server 2008 to be applied to tables and indexes. …
For many years, “Big Data” has been widespread and trendy. Big Data technologies started to fill the gap between traditional data technologies (RDBMS, file systems …) and the rapid evolution of data and business needs.
While implementing these technologies is a must for many large-scale organizations to ensure business continuity, many organizations aim to adopt them without really knowing whether they can improve their business.
Before making your decision, there are many things you should take into consideration.
Before asking whether your business needs Big Data technologies, you first have to know what…
Even after the rise of Big Data technologies, Microsoft SQL Server Integration Services is still one of the most popular data integration tools. SSIS developers mainly use Visual Studio to develop their data integration packages. One of the main challenges they face is designing tens or hundreds of similar packages, each of which has to be recreated from scratch. Even though SSIS package parts were introduced in SQL Server 2016 to increase reusability, many scenarios still require a higher level of reusability.
This article will mention four approaches that I have tried while working as…
In its first release (1.0.0), SchemaMapper was developed to merge data from different file types (flat files, Excel, Access …) into one SQL table. SchemaMapper 1.1.0 improves on this by supporting reading from relational databases and writing data into more data formats. SchemaMapper 1.1.0 is also available as a NuGet package.
A few years ago, I kept hearing from my colleagues, “don’t ever think about installing Hadoop on the Windows operating system!” I was not convinced by this advice because I am a big fan of Microsoft products, especially Windows.
In the past few years, I worked on several projects where we were asked to build a Big Data ecosystem using Hadoop and related technologies on Ubuntu. It was not easy to work with these technologies, especially given the lack of online resources. Last month, I was asked to build a Big Data ecosystem on Windows. …
This article is part of a series that we are publishing on TowardsDataScience.com that aims to illustrate how to install Big Data technologies on the Windows operating system.
Previously published:
In this article, we provide a step-by-step guide to installing Apache Pig 0.17.0 on Windows 10.
Apache Pig is a platform built on top of Hadoop. You can refer to our previously published article to install a Hadoop single-node cluster on Windows 10.
While working on a project, we were asked to install Apache Hive on a Windows 10 operating system. We found many guides online, but unfortunately they didn’t work. For this reason, I decided to write a step-by-step guide to help others.
The starting point of this guide was a great video I found on YouTube that provides a working scenario for Hive 2.x without much detail.
This article is part of a series that we are publishing on TowardsDataScience.com that aims to illustrate how to install Big Data technologies on the Windows operating system.
Other published articles in this…
While working on a project two years ago, I wrote a step-by-step guide to installing Hadoop 3.1.0 on the Ubuntu 16.04 operating system. Since we are currently working on a new project where we need to install a Hadoop cluster on Windows 10, I decided to write a guide for this process.
This article is part of a series that we are publishing on TowardsDataScience.com that aims to illustrate how to install Big Data technologies on the Windows operating system.
Other published articles in this series:
First…
SchemaMapper is a C# data integration class library that facilitates the process of importing data from external sources that have different schema definitions.
It can:
SchemaMapper is developed with .NET Framework 4.5 …
These security methods increase the network…
Data Engineer, Ph.D. Candidate in Data Science