Data collection and data storage are the first stages in the data lifecycle.

The collection stage consists of creating a communication route between the source of the information and the storage. The storage stage consists of gathering and keeping all information deemed relevant. To do this, it is necessary to determine what information we want to collect as well as the infrastructure in which it is to be saved. Data collection and storage are thus necessary stages of any Big Data project.

With the rise of the Internet of Things (IoT) and digitalisation, information is omnipresent and reaches all fields. In marketing, data may come from mobile apps, website activity, ticket sales or loyalty cards. In industrial fields, data often comes from sensors that measure quantities such as vibration, electric voltage or temperature in machines and engines. Text-based data is another possible source of information: résumés in human resources, emails and even press articles.

The technologies used in the data collection phase are directly influenced by the type of object that is collected and how accessible it is.

For example, a website may offer a web service that gives our team easy access to its information. If no such access exists, a scraping solution must be put in place. Some websites also market and sell their data directly, in which case the data can be downloaded automatically or even sent by email. Engine sensors, on the other hand, generally communicate via specific protocols such as OPC UA.
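The difference between the first two access routes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the JSON layout, the `price` field, the `<span class="price">` markup and the sample strings are hypothetical, and no real website or Swiss-SDI tooling is implied. Parsing a web service response takes a single line, whereas scraping has to recover the same values from the page markup.

```python
import json
from html.parser import HTMLParser


def prices_from_api(payload: str) -> list[float]:
    """Read prices from the JSON response of a (hypothetical) web service."""
    return [float(item["price"]) for item in json.loads(payload)["items"]]


class PriceScraper(HTMLParser):
    """Collect the text of every <span class="price"> element in a page."""

    def __init__(self) -> None:
        super().__init__()
        self._in_price = False
        self.prices: list[float] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(float(data.strip()))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


def prices_from_html(page: str) -> list[float]:
    """Scrape prices from raw HTML when no web service is available."""
    scraper = PriceScraper()
    scraper.feed(page)
    scraper.close()
    return scraper.prices


# Hypothetical sample inputs standing in for real responses
API_SAMPLE = '{"items": [{"price": "12.5"}, {"price": "8"}]}'
HTML_SAMPLE = (
    '<ul><li><span class="price">12.5</span></li>'
    '<li><span class="price">8</span></li></ul>'
)
```

Both functions return the same list for the two samples, but the scraper depends on the page layout and has to be updated whenever that layout changes, which is why a web service is preferable when one exists.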

The collected data is then sent to a common, centralized storage structure called a Data Lake. This architecture not only allows our team to gather and link different data sources, it also stores raw and conformed data side by side. Finally, depending on your needs and business policies, the storage can be in-house or hosted by an external cloud provider. We adapt our solutions to your needs in an agile manner, taking your requirements into account.
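As a rough sketch of what keeping raw and conformed data side by side can look like, the Python example below writes an incoming record unchanged into a `raw` zone and a cleaned copy into a `conformed` zone of the same folder tree. The zone names, the folder layout, the field names and the cleaning rules are all assumptions chosen for illustration; an actual Data Lake would typically sit on a distributed or cloud storage service rather than a local directory.

```python
import json
import tempfile
from pathlib import Path


def store_raw(lake: Path, source: str, name: str, payload: str) -> Path:
    """Save the payload exactly as received under raw/<source>/<name>."""
    path = lake / "raw" / source / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(payload, encoding="utf-8")
    return path


def conform(lake: Path, raw_path: Path) -> Path:
    """Write a cleaned copy under conformed/, mirroring the raw layout."""
    record = json.loads(raw_path.read_text(encoding="utf-8"))
    cleaned = {
        # Hypothetical cleaning rules: normalised id, numeric temperature
        "sensor_id": str(record["id"]).strip().lower(),
        "temperature_c": round(float(record["temp"]), 1),
    }
    target = lake / "conformed" / raw_path.relative_to(lake / "raw")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(cleaned), encoding="utf-8")
    return target


# A temporary directory stands in for the lake in this sketch
lake_root = Path(tempfile.mkdtemp())
RAW_PAYLOAD = '{"id": " M-17 ", "temp": "71.26"}'
raw_file = store_raw(lake_root, "sensors", "reading-001.json", RAW_PAYLOAD)
conformed_file = conform(lake_root, raw_file)
```

Because the raw file is never modified, the conformed copy can be regenerated at any time if the cleaning rules change.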

Through our data collection and storage services, Swiss-SDI supports you from start to finish in your data projects, from data collection all the way to data analysis.


Would you like to start, improve or automate a data collection process?

Contact Us!

Use Cases


Sensor data collection

When no data is available yet, any machine learning project must start with a data collection phase. This is particularly true for Industry 4.0 projects. Swiss-SDI supports you from these first challenges onward with a collection solution adapted to your needs: installation of sensors, retrieval of information from the web, or database cross-referencing.


Storage Solutions

Keeping data available throughout its life is essential to extracting added value from it. To ensure the persistence of your data, our team uses technologies ranging from relational databases to NoSQL stores, which we install in-house or in the cloud. Swiss-SDI adapts to your ecosystem by offering you the right technology.


Extraction of textual data

Text is probably the most common type of data used to transmit information. It is found in many sources (e-mails, CRM systems, archives, newspapers) and in various formats (Word, PDF, HTML and other documents). Swiss-SDI puts its text mining skills at your disposal to automatically extract information from this unstructured data, for example to classify mail or to analyze the content of press articles.
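To give an idea of what mail classification involves, here is a deliberately simple Python sketch based on keyword counting. The categories, the keyword lists and the `classify_mail` function are hypothetical and chosen only for illustration; a production text mining pipeline would more likely rely on statistical or machine learning models trained on labelled messages.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, for illustration only
KEYWORDS = {
    "billing": {"invoice", "payment", "refund"},
    "support": {"error", "crash", "password"},
}


def classify_mail(text: str) -> str:
    """Return the category whose keywords occur most often, or 'other'."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        label: sum(words[word] for word in vocabulary)
        for label, vocabulary in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"
```

A message mentioning an invoice and a payment would be routed to `billing`, while a message containing none of the keywords falls back to `other`.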