Data Mining in the Cloud
Sceo is a Data Mining web application accompanied by an online resource site.
About this project
Sceo is a browser-based Data Mining application. The application is accompanied by a website that includes resources such as tutorials, examples, forums, and answers to frequently asked questions, all intended to ease the data mining learning curve.
Data Mining is a rapidly growing field of analysis that is used in multiple industries. A good general definition of Data Mining is the analysis of raw information in order to produce useful, relevant knowledge. Data Mining allows users to analyze large amounts of raw data using different techniques. The outcome of the analysis is largely driven by the quality of the information provided and the analysis technique used.
Sceo is a conversion of the Weka Data Mining application and libraries. It provides all the functionality of the desktop version in the form of a web application. Sceo contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. A quick overview of these tools and techniques is given below. Examples of how to use the tools can be found in the Examples and Case Studies section.
Sceo allows users to import, view, and re-work a number of files before processing. Because the state of the raw information is important for analysis, Sceo provides tools to help “condition” data so that the end results are more accurate.
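As an illustration of what "conditioning" data can mean, the sketch below fills in missing values and rescales a numeric attribute. It is a minimal example in plain Python, not Sceo's or Weka's own filters; the function names and the sample `ages` column are hypothetical.

```python
def fill_missing_with_mean(column):
    """Replace None entries with the mean of the known values."""
    known = [v for v in column if v is not None]
    if not known:
        raise ValueError("column has no known values")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]


def min_max_normalize(column):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical attribute with one missing entry.
ages = [20, None, 40, 60]
filled = fill_missing_with_mean(ages)
scaled = min_max_normalize(filled)
```

Rescaling attributes to a common range matters most for distance-based techniques, where an attribute with large raw values would otherwise dominate the comparison.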
Classification learning is an attempt to classify a new, or unknown, instance based on characteristics shared with previously classified examples.
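One simple way to classify by "shared characteristics" is a nearest-neighbor vote: find the previously classified examples closest to the new instance and take the most common label among them. The sketch below assumes numeric attributes and Euclidean distance; it is only one of many classification techniques, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Label `query` by majority vote among its k nearest training examples.

    `training` is a list of (features, label) pairs.
    """
    by_distance = sorted(training, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, previously classified examples.
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((2.0, 1.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 7.5), "large"),
    ((8.5, 9.0), "large"),
]

prediction = knn_classify(training, (7.0, 8.0))
```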
Regression is perhaps one of the most easily understood and easiest to use methods of analysis in Data Mining. Linear Regression, a form of regression modeling, is a widely used technique for numeric predictive modeling. A linear model provides users with an equation composed of the sum of the data set's attributes, each multiplied by a weight.
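To make the idea of a weighted sum concrete, the sketch below fits a linear model with a single attribute using ordinary least squares, giving an equation of the form y = intercept + slope * x. With more attributes the equation simply gains one weighted term per attribute. This is an illustrative sketch with hypothetical names and data, not the routine Sceo or Weka uses.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Apply the fitted equation to a new attribute value."""
    return intercept + slope * x


# Hypothetical data generated from y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
```

Here the fitted weight (the slope) says how much the prediction changes for each unit change in the attribute, which is what makes linear models easy to interpret.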
Clustering algorithms provide users with the ability to group instances based on similarities. Clustering output lends itself well to visual analysis. Clusters can be easily determined and visualized in two- or three-dimensional space, allowing users to analyze and distinguish patterns.
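A common way to group instances by similarity is k-means: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat. The sketch below is a minimal version that takes the starting centroids as an argument so the result is reproducible; the function name and the sample points are hypothetical, and it is not presented as Sceo's implementation.

```python
import math


def k_means(points, centroids, iterations=10):
    """Return (centroids, clusters) after a fixed number of k-means passes."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-dimensional points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])
```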
Association analysis is the discovery of association rules, that is, the likelihood that a certain instance will occur given that another instance has occurred.
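The "likelihood" in an association rule is usually measured with two numbers: support (how often the items appear together) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both over a small, hypothetical set of shopping baskets; the function names are illustrative only.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated likelihood of `consequent` given that `antecedent` occurred."""
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_support = support(baskets, {"bread"})
bread_to_milk = confidence(baskets, {"bread"}, {"milk"})
```

In this data bread appears in three of four baskets, and two of those three also contain milk, so the rule "bread implies milk" holds in two thirds of the cases where it applies.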
Sceo also provides tools for visualizing and analyzing results after processing. This analysis may be done by examining command-line output, such as the weights of a regression model, by viewing output in two-dimensional graphs, or by viewing decision tree structures.
How can it be used?
Sceo can be used to analyze large amounts of raw information, which is searched for defining characteristics and patterns. The patterns collected can then be used for classification, data modeling, and trend prediction. Below is a small list of areas in which Sceo may be used.
- Medical Discovery
- Business Analytics
- Scientific Discovery
Sceo User Interface
Online Support and Examples
A large number of online articles, case studies, and examples will be provided, helping to ease the learning curve and inspiring users to pursue their own projects.
Online community forums will allow users to share information, collaborate on projects, and help others find solutions.
Registration and Login
The Sceo data mining software is open source and falls under the GNU General Public License.