SERVICES WE PROVIDE

We are Data Service Experts

From data integration to data science, we offer all the data-related services you may need.

Data and Application Integration

From Data Hub and ESB to application orchestration, we bring seamless harmony to enterprises.

Data Warehousing and BI

Data Warehousing, Data Lake, Data Analytics and Business Intelligence

Data Science and AI

Data Science, Machine Learning and Artificial Intelligence

Data Platforms

Our Products

We provide mature, proven data service platforms to suit your latest and most critical business needs. These products include: Open Intelligent Data Platform, Customer 360 Analytics, Recommender System and Enterprise Data Hub.

Open Intelligence Data Platform

Customer 360 Analytics

Recommender System

Enterprise Data Hub

TESTIMONIALS

Our Clients Say

Experts in the data service domain, with the right solutions at the right cost!

Williams Moore

Director, BI and Analytics

They brought enterprise data from different silos together and seamlessly linked it for a 360-degree view of our whole business. Those silos had been hindering our business for a long time!

Our Blog

Latest News

Tech insights from the industry insiders

Application Integration

3 Pillars of Manufacturing Information System

Jan 2, 2021 systan 2 Comments

Data Services

Data Lake, Data River and Data Droplets

Dec 16, 2020 systan 2 Comments

You have probably already heard about data lakes. They used to be called data directories. As you would expect, data rivers end their “streams” in the lake. Then there are data ponds; see “Connected Data Ponds: The Evolution of Data Lakes” (Hortonworks, hortonworks.com).

Data ponds are subsets of data lakes that are kept separate for reasons of privacy (e.g. PII), governance, technology or cost.

Data droplets are the basic element. They describe information and dimensions about the subject. Here you can read more about these ontologies.

Then we have the data swamp, an issue that tends to be more severe in larger organizations. The image below explains the differences:

[Image from: DatAvail.com]

There are many reasons behind a data swamp. Below are a few:

No policy for the metadata, definition, or the process
Missing life-cycle for the data in the lake
No stakeholder in the organization for the data
Missing documentation about the preparation/usage process of the data

Bigger companies have started to build solutions for this issue. Metacat from Netflix helps you understand the metadata held in different services. If you want to keep it simple and work through a user interface, the CKAN data portal can help you manage and govern your data.

Data Science

User-based Collaborative Filtering

Feb 18, 2020 systan 2 Comments

User-Based Collaborative Filtering (UB-CF)

Imagine that we want to recommend a movie to our friend Stanley. We could assume that similar people have similar tastes. Suppose that Stanley and I have seen the same movies, and we rated them all almost identically. But Stanley hasn’t seen ‘The Godfather: Part II’ and I have. If I love that movie, it sounds logical to think that he will too. With that, we have created an artificial rating based on our similarity.

UB-CF uses that logic and recommends items by finding users similar to the active user (the one to whom we are trying to recommend a movie). A specific application of this is the user-based Nearest Neighbor algorithm. This algorithm involves two tasks:

1. Find the K nearest neighbors (KNN) of user a, using a similarity function w to measure the distance between each pair of users.

2. Predict the rating that user a will give to every item the K neighbors have consumed but a has not. We look for the item j with the best predicted rating.

In other words, we are creating a User-Item Matrix and predicting the ratings for items the active user has not seen, based on other similar users. This technique is memory-based.
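
The two tasks above can be sketched in a few lines of Python. This is only a minimal illustration, not a production recommender: the ratings are made up, cosine similarity over co-rated items is an assumed choice for the similarity function w, and the helper names (`cosine_similarity`, `nearest_neighbors`, `predict_ratings`) are hypothetical.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); values are illustrative only.
ratings = {
    "me": {"The Godfather": 5, "The Godfather: Part II": 5, "Casablanca": 4},
    "stanley": {"The Godfather": 5, "Casablanca": 4},
    "alex": {"The Godfather": 1, "The Godfather: Part II": 2, "Casablanca": 5},
}

def cosine_similarity(u, v):
    """Assumed similarity function w, computed over co-rated items only."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def nearest_neighbors(ratings, active, k):
    """Task 1: the k users most similar to the active user."""
    scored = [
        (cosine_similarity(ratings[active], ratings[other]), other)
        for other in ratings
        if other != active
    ]
    scored.sort(reverse=True)
    return scored[:k]

def predict_ratings(ratings, active, k):
    """Task 2: similarity-weighted average rating for each unseen item."""
    totals, weights = {}, {}
    for w, other in nearest_neighbors(ratings, active, k):
        if w <= 0:
            continue  # ignore neighbors with no positive similarity
        for item, rating in ratings[other].items():
            if item in ratings[active]:
                continue  # the active user has already rated this item
            totals[item] = totals.get(item, 0.0) + w * rating
            weights[item] = weights.get(item, 0.0) + w
    return {item: totals[item] / weights[item] for item in totals}

predictions = predict_ratings(ratings, "stanley", k=1)
best_item = max(predictions, key=predictions.get)
print(best_item, predictions[best_item])
```

With k=1 the only neighbor is the user whose ratings match Stanley's, so the unseen movie inherits that user's rating. Larger values of k blend several neighbors, weighted by their similarity.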

PROS:

Easy to implement.
Context independent.
Compared to other techniques, such as content-based filtering, it is more accurate.

CONS:

Sparsity: The percentage of people who rate items is really low.
Scalability: The more neighbors K we consider (up to a certain threshold), the better the prediction should be. Nevertheless, the more users there are in the system, the greater the cost of finding the K nearest neighbors.
Cold-start: New users will have little to no information about them to compare with other users.
New item: As with the previous point, new items will lack the ratings needed to create a solid ranking.
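
To make the sparsity, cold-start and new-item points concrete, here is a small sketch that measures them on a toy User-Item Matrix. The data and the helper names (`sparsity`, `cold_start_users`, `unrated_items`) are hypothetical and only meant as an illustration.

```python
# Hypothetical ratings (user -> {item: rating}); illustrative values only.
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 4},
    "u3": {},  # new user who has not rated anything yet (cold start)
}
items = ["A", "B", "C", "D"]  # "C" and "D" have not been rated by anyone

def sparsity(ratings, items):
    """Fraction of user-item cells that hold no rating."""
    filled = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1 - filled / (len(ratings) * len(items))

def cold_start_users(ratings):
    """Users with no ratings, who cannot be compared with anyone."""
    return [user for user, user_ratings in ratings.items() if not user_ratings]

def unrated_items(ratings, items):
    """Items with no ratings, which can never be recommended by UB-CF."""
    rated = {item for user_ratings in ratings.values() for item in user_ratings}
    return [item for item in items if item not in rated]

print(sparsity(ratings, items))          # share of empty cells
print(cold_start_users(ratings))         # users affected by cold start
print(unrated_items(ratings, items))     # items affected by the new-item problem
```

Even in this tiny example most cells are empty. Real rating matrices are typically far sparser, which is why these issues matter in practice.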

Trusted By Worldwide Businesses. Try Today!

Looking for a Professional Approach & Quality Services!

Get Started Now!

We are the Best in Data Services

Telecommunications

Manufacturing

Transportation