Which Of The Following Statements Is True Of Unsupervised Data Mining?

Process of extracting and discovering patterns in large data sets

Data mining is a process of extracting and discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.[1] Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use.[1] [2] [3] [4] Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.[5] Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[1]

The term "data mining" is a misnomer because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself.[6] It is also a buzzword[7] and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support systems, including artificial intelligence (e.g., machine learning) and business intelligence. The book Data mining: Practical machine learning tools and techniques with Java[8] (which covers mostly machine learning material) was originally to be named just Practical machine learning, and the term data mining was only added for marketing reasons.[9] Often the more general terms (large scale) data analysis and analytics, or, when referring to actual methods, artificial intelligence and machine learning, are more appropriate.

The actual data mining task is the semi-automated or automated analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, but they do belong to the overall KDD process as additional steps.
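As a rough illustration of "identifying multiple groups in the data", the sketch below runs a one-dimensional version of k-means (Lloyd's algorithm). The algorithm choice, the function name `kmeans_1d`, the starting centers, and the purchase amounts are all assumptions made for this example; the text above does not prescribe any particular clustering method.

```python
from statistics import mean

def kmeans_1d(values, centers, iterations=10):
    """Lloyd's algorithm on 1-D data: assign each value to its nearest
    center, then move each center to the mean of its assigned values."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical purchase amounts with two visible groups.
amounts = [9, 10, 11, 12, 98, 100, 102]
centers, groups = kmeans_1d(amounts, centers=[0, 50])
print(centers)  # one center per discovered group
print(groups)   # the values assigned to each center
```

The resulting group label could then be passed to a downstream predictive model as an extra feature, which is the kind of reuse the paragraph describes.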

The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.[10]

The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.

Etymology

In the 1960s, statisticians and economists used terms like data fishing or data dredging to refer to what they considered the bad practice of analyzing data without an a priori hypothesis. The term "data mining" was used in a similarly critical way by economist Michael Lovell in an article published in The Review of Economics and Statistics in 1983.[11] [12] Lovell indicates that the practice "masquerades under a variety of aliases, ranging from 'experimentation' (positive) to 'fishing' or 'snooping' (negative)".

The term data mining appeared around 1990 in the database community, with generally positive connotations. For a short time in the 1980s the phrase "database mining"™ was used, but because it was trademarked by HNC, a San Diego-based company, to pitch their Database Mining Workstation,[13] researchers turned to data mining instead. Other terms used include data archaeology, information harvesting, information discovery, knowledge extraction, etc. Gregory Piatetsky-Shapiro coined the term "knowledge discovery in databases" for the first workshop on the same topic (KDD-1989), and this term became more popular in the AI and machine learning community. However, the term data mining became more popular in the business and press communities.[14] Currently, the terms data mining and knowledge discovery are used interchangeably.

In the academic community, the major forums for research started in 1995 when the First International Conference on Data Mining and Knowledge Discovery (KDD-95) was held in Montreal under AAAI sponsorship. It was co-chaired by Usama Fayyad and Ramasamy Uthurusamy. A year later, in 1996, Usama Fayyad launched the Kluwer journal Data Mining and Knowledge Discovery as its founding editor-in-chief. Later he started the SIGKDD newsletter SIGKDD Explorations.[15] The KDD International conference became the primary highest-quality conference in data mining, with an acceptance rate of research paper submissions below 18%. The journal Data Mining and Knowledge Discovery is the primary research journal of the field.

Background

The manual extraction of patterns from data has occurred for centuries. Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis (1800s).[16] The proliferation, ubiquity and increasing power of computer technology have dramatically increased data collection, storage, and manipulation ability. As data sets have grown in size and complexity, direct "hands-on" data analysis has increasingly been augmented with indirect, automated data processing, aided by other discoveries in computer science, especially in the field of machine learning, such as neural networks, cluster analysis, genetic algorithms (1950s), decision trees and decision rules (1960s), and support vector machines (1990s). Data mining is the process of applying these methods with the intention of uncovering hidden patterns[17] in large data sets. It bridges the gap from applied statistics and artificial intelligence (which usually provide the mathematical background) to database management by exploiting the way data is stored and indexed in databases to execute the actual learning and discovery algorithms more efficiently, allowing such methods to be applied to ever-larger data sets.

Process

The knowledge discovery in databases (KDD) process is commonly defined with the stages:

  1. Selection
  2. Pre-processing
  3. Transformation
  4. Data mining
  5. Interpretation/evaluation.[5]

It exists, however, in many variations on this theme, such as the Cross-industry standard process for data mining (CRISP-DM), which defines six phases:

  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation
  6. Deployment

or a simplified process such as (1) Pre-processing, (2) Data Mining, and (3) Results Validation.

Polls conducted in 2002, 2004, 2007 and 2014 show that the CRISP-DM methodology is the leading methodology used by data miners.[18] The only other data mining standard named in these polls was SEMMA. However, 3–4 times as many people reported using CRISP-DM. Several teams of researchers have published reviews of data mining process models,[19] and Azevedo and Santos conducted a comparison of CRISP-DM and SEMMA in 2008.[20]

Pre-processing

Before data mining algorithms can be used, a target data set must be assembled. As data mining can only uncover patterns actually present in the data, the target data set must be large enough to contain these patterns while remaining concise enough to be mined within an acceptable time limit. A common source for data is a data mart or data warehouse. Pre-processing is essential to analyze the multivariate data sets before data mining. The target set is then cleaned. Data cleaning removes the observations containing noise and those with missing data.
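The cleaning step can be sketched in a few lines. This is only an illustration under stated assumptions: records are plain dictionaries, "missing" means an absent or `None` field, and "noise" is approximated by a value falling outside a plausible range. The function name `clean`, the field names, and the ranges are hypothetical.

```python
def clean(records, required, ranges):
    """Drop records with missing required fields or implausible values.

    `ranges` maps a field name to an inclusive (low, high) pair; every
    field named in `ranges` is expected to also appear in `required`.
    """
    kept = []
    for r in records:
        if any(r.get(f) is None for f in required):
            continue  # missing data
        if any(not (lo <= r[f] <= hi) for f, (lo, hi) in ranges.items()):
            continue  # outside the plausible range, treated here as noise
        kept.append(r)
    return kept

raw = [
    {"id": 1, "age": 34, "spend": 120.0},
    {"id": 2, "age": None, "spend": 80.0},   # age recorded as missing
    {"id": 3, "spend": 45.5},                # age field absent
    {"id": 4, "age": 51, "spend": 300.0},
    {"id": 5, "age": 430, "spend": 60.0},    # implausible age
]
cleaned = clean(raw, required=["age", "spend"], ranges={"age": (0, 120)})
print([r["id"] for r in cleaned])
```

Dropping whole records is the simplest policy and matches the description above; real projects may prefer to impute or correct values instead, which this sketch does not attempt.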

Data mining

Data mining involves six common classes of tasks:[5]

  • Anomaly detection (outlier/change/deviation detection) – The identification of unusual data records that might be interesting, or data errors that require further investigation.
  • Association rule learning (dependency modeling) – Searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
  • Clustering – The task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
  • Classification – The task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
  • Regression – Attempts to find a function that models the data with the least error, that is, for estimating the relationships among data or datasets.
  • Summarization – Providing a more compact representation of the data set, including visualization and report generation.
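To make the market basket example from the list above concrete, the sketch below computes the two standard measures used to rank association rules: support (how often an item set occurs) and confidence (how often the consequent appears when the antecedent does). The baskets and the helper names `support` and `confidence` are made up for illustration, and no rule-search algorithm such as Apriori is implemented here.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical supermarket baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]

s = support(baskets, {"bread", "butter"})
c = confidence(baskets, {"bread"}, {"butter"})
print(f"support(bread, butter) = {s:.2f}")
print(f"confidence(bread -> butter) = {c:.2f}")
```

A rule such as "bread implies butter" would normally be reported only if both numbers clear thresholds chosen by the analyst.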

Results validation

An example of data produced by data dredging through a bot operated by statistician Tyler Vigen, apparently showing a close link between the best word winning a spelling bee competition and the number of people in the United States killed by venomous spiders. The similarity in trends is obviously a coincidence.

Data mining can unintentionally be misused, and can then produce results that appear to be significant but which do not actually predict future behavior, cannot be reproduced on a new sample of data, and bear little use. Often this results from investigating too many hypotheses and not performing proper statistical hypothesis testing. A simple version of this problem in machine learning is known as overfitting, but the same problem can arise at different phases of the process, and thus a train/test split (when applicable at all) may not be sufficient to prevent this from happening.[21]
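The effect of "investigating too many hypotheses" can be reproduced with pure noise. In the sketch below, every series is independent random data, yet picking the best of many candidates tends to yield a correlation that looks strong. The sample sizes, the seed, and the helper `pearson` are choices made for this illustration only.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n_points, n_candidates = 10, 1000

# A target series and many candidate series, all independent noise.
target = [rng.gauss(0, 1) for _ in range(n_points)]
candidates = [[rng.gauss(0, 1) for _ in range(n_points)]
              for _ in range(n_candidates)]

# "Dredging": keep whichever unrelated series happens to correlate best.
best_r = max(abs(pearson(target, c)) for c in candidates)
print(f"strongest |r| among {n_candidates} unrelated series: {best_r:.2f}")
```

Because nothing links the series, the selected correlation would not be expected to hold on fresh data, which is why a correction for the number of hypotheses tested is needed before such a result is treated as a finding.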

The final step of knowledge discovery from data is to verify that the patterns produced by the data mining algorithms occur in the wider data set. Not all patterns found by data mining algorithms are necessarily valid. It is common for data mining algorithms to find patterns in the training set which are not present in the general data set. This is called overfitting. To overcome this, the evaluation uses a test set of data on which the data mining algorithm was not trained. The learned patterns are applied to this test set, and the resulting output is compared to the desired output. For example, a data mining algorithm trying to distinguish "spam" from "legitimate" e-mails would be trained on a training set of sample e-mails. Once trained, the learned patterns would be applied to the test set of e-mails on which it had not been trained. The accuracy of the patterns can then be measured from how many e-mails they correctly classify. Several statistical methods may be used to evaluate the algorithm, such as ROC curves.
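The spam example can be sketched as follows. The "learning algorithm" here is a deliberately naive word-count scorer invented for this illustration, and the messages are made up; the point is only the evaluation pattern of fitting on one set and measuring accuracy on a held-out set.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears in spam and in legitimate mail."""
    counts = {"spam": Counter(), "legitimate": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Label a message by which class has seen its words more often."""
    words = text.lower().split()
    spam_score = sum(counts["spam"][w] for w in words)
    legit_score = sum(counts["legitimate"][w] for w in words)
    return "spam" if spam_score > legit_score else "legitimate"

def accuracy(counts, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(classify(counts, text) == label for text, label in examples)
    return correct / len(examples)

training_set = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("lunch on monday", "legitimate"),
]
test_set = [  # never shown to train()
    ("claim your free prize", "spam"),
    ("agenda for lunch meeting", "legitimate"),
    ("free lunch today", "legitimate"),
    ("click to win money", "spam"),
]

model = train(training_set)
train_acc = accuracy(model, training_set)
test_acc = accuracy(model, test_set)
print(f"training accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The scorer fits its own training messages perfectly but mislabels "free lunch today" because it has only ever seen "free" in spam. That gap between training and test accuracy is the symptom of overfitting that the held-out set is meant to expose.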

If the learned patterns do not meet the desired standards, then it is necessary to re-evaluate and change the pre-processing and data mining steps. If the learned patterns do meet the desired standards, then the final step is to interpret the learned patterns and turn them into knowledge.

Research

The premier professional body in the field is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining (SIGKDD).[22] [23] Since 1989, this ACM SIG has hosted an annual international conference and published its proceedings,[24] and since 1999 it has published a biannual academic journal titled "SIGKDD Explorations".[25]

Computer science conferences on data mining include:

  • CIKM Conference – ACM Conference on Information and Knowledge Management
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
  • KDD Conference – ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Data mining topics are also present at many data management/database conferences such as the ICDE Conference, SIGMOD Conference and International Conference on Very Large Data Bases.

Standards

There have been some efforts to define standards for the data mining process, for example, the 1999 European Cross Industry Standard Process for Data Mining (CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Development on successors to these processes (CRISP-DM 2.0 and JDM 2.0) was active in 2006 but has stalled since. JDM 2.0 was withdrawn without reaching a final draft.

For exchanging the extracted models, in particular for use in predictive analytics, the key standard is the Predictive Model Markup Language (PMML), which is an XML-based language developed by the Data Mining Group (DMG) and supported as an exchange format by many data mining applications. As the name suggests, it only covers prediction models, a particular data mining task of high importance to business applications. However, extensions to cover (for example) subspace clustering have been proposed independently of the DMG.[26]

Notable uses

Data mining is used wherever there is digital data available today. Notable examples of data mining can be found throughout business, medicine, science, and surveillance.

Privacy concerns and ethics

While the term "data mining" itself may have no ethical implications, it is often associated with the mining of information in relation to people's behavior (ethical and otherwise).[27]

The ways in which data mining can be used can in some cases and contexts raise questions regarding privacy, legality, and ethics.[28] In particular, data mining government or commercial data sets for national security or law enforcement purposes, such as in the Total Information Awareness Program or in ADVISE, has raised privacy concerns.[29] [30]

Data mining requires data preparation, which can uncover information or patterns that compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. Data aggregation involves combining data together (possibly from various sources) in a way that facilitates analysis (but that also might make identification of private, individual-level data deducible or otherwise apparent).[31] This is not data mining per se, but a result of the preparation of data before, and for the purposes of, the analysis. The threat to an individual's privacy comes into play when the data, once compiled, cause the data miner, or anyone who has access to the newly compiled data set, to be able to identify specific individuals, especially when the data were originally anonymous.[32]

It is recommended to be aware of the following before data are collected:[31]

  • The purpose of the data collection and any (known) data mining projects;
  • How the data will be used;
  • Who will be able to mine the data and use the data and their derivatives;
  • The status of security surrounding access to the data;
  • How collected data can be updated.

Data may also be modified so as to become anonymous, so that individuals may not readily be identified.[31] However, even "anonymized" data sets can potentially contain enough information to allow identification of individuals, as occurred when journalists were able to find several individuals based on a set of search histories that were inadvertently released by AOL.[33]
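One common way to quantify how identifiable an "anonymized" table remains is to count how many records share each combination of quasi-identifying attributes, the idea behind k-anonymity. That measure is brought in here purely as an illustration and is not something the text above names; the records, field names, and the helper `smallest_group` are likewise hypothetical.

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest set of records that share the same values
    for all quasi-identifier fields (the k in k-anonymity)."""
    keys = Counter(tuple(r[f] for f in quasi_identifiers) for r in records)
    return min(keys.values())

# Names removed, but ZIP code, birth year and sex are still present.
released = [
    {"zip": "12345", "birth_year": 1980, "sex": "F", "condition": "A"},
    {"zip": "12345", "birth_year": 1980, "sex": "F", "condition": "B"},
    {"zip": "67890", "birth_year": 1975, "sex": "M", "condition": "A"},
    {"zip": "67890", "birth_year": 1990, "sex": "M", "condition": "C"},
]

k_full = smallest_group(released, ["zip", "birth_year", "sex"])
k_coarse = smallest_group(released, ["zip", "sex"])
print(k_full, k_coarse)
```

A smallest group of size 1 means at least one person is unique on those attributes and could be re-identified by anyone who knows them from another source. Releasing fewer or coarser attributes raises the group size at the cost of analytical detail.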

The inadvertent revelation of personally identifiable information leading to the provider violates Fair Information Practices. This indiscretion can cause financial, emotional, or bodily harm to the indicated individual. In one instance of privacy violation, the patrons of Walgreens filed a lawsuit against the company in 2011 for selling prescription information to data mining companies who in turn provided the data to pharmaceutical companies.[34]

Situation in Europe

Europe has rather strong privacy laws, and efforts are underway to further strengthen the rights of consumers. However, the U.S.–E.U. Safe Harbor Principles, developed between 1998 and 2000, currently effectively expose European users to privacy exploitation by U.S. companies. As a consequence of Edward Snowden's global surveillance disclosure, there has been increased discussion to revoke this agreement, as in particular the data will be fully exposed to the National Security Agency, and attempts to reach an agreement with the United States have failed.[35]

In the United Kingdom in particular there have been cases of corporations using data mining as a way to target certain groups of customers, forcing them to pay unfairly high prices. These groups tend to be people of lower socio-economic status who are not savvy to the ways they can be exploited in digital marketplaces.[36]

Situation in the United States

In the United States, privacy concerns have been addressed by the United States Congress via the passage of regulatory controls such as the Health Insurance Portability and Accountability Act (HIPAA). The HIPAA requires individuals to give their "informed consent" regarding information they provide and its intended present and future uses. According to an article in Biotech Business Week, "'[i]n practice, HIPAA may not offer any greater protection than the longstanding regulations in the research arena,' says the AAHC. More importantly, the rule's goal of protection through informed consent is approach a level of incomprehensibility to average individuals."[37] This underscores the necessity for data anonymity in data aggregation and mining practices.

U.S. information privacy legislation such as HIPAA and the Family Educational Rights and Privacy Act (FERPA) applies only to the specific areas that each such law addresses. The use of data mining by the majority of businesses in the U.S. is not controlled by any legislation.

Copyright law

Situation in Europe

Under European copyright and database laws, the mining of in-copyright works (such as by web mining) without the permission of the copyright owner is not legal. Where a database is pure data in Europe, it may be that there is no copyright, but database rights may exist, so data mining becomes subject to intellectual property owners' rights that are protected by the Database Directive. On the recommendation of the Hargreaves review, this led the UK government to amend its copyright law in 2014 to allow content mining as a limitation and exception.[38] The UK was the second country in the world to do so after Japan, which introduced an exception in 2009 for data mining. However, due to the restriction of the Information Society Directive (2001), the UK exception only allows content mining for non-commercial purposes. UK copyright law also does not allow this provision to be overridden by contractual terms and conditions. Since 2020, Switzerland has also been regulating data mining by allowing it in the research field under certain conditions laid down by art. 24d of the Swiss Copyright Act. This new article entered into force on 1 April 2020.[39]

The European Commission facilitated stakeholder discussion on text and data mining in 2013, under the title of Licences for Europe.[40] The focus on the solution to this legal issue, such as licensing rather than limitations and exceptions, led representatives of universities, researchers, libraries, civil society groups and open access publishers to leave the stakeholder dialogue in May 2013.[41]

Situation in the United States

US copyright law, and in particular its provision for fair use, upholds the legality of content mining in America, and other fair use countries such as Israel, Taiwan and South Korea. As content mining is transformative, that is, it does not supplant the original work, it is viewed as being lawful under fair use. For example, as part of the Google Book settlement the presiding judge on the case ruled that Google's digitization project of in-copyright books was lawful, in part because of the transformative uses that the digitization project displayed, one being text and data mining.[42]

Software

Free open-source data mining software and applications

The following applications are available under free/open-source licenses. Public access to application source code is also available.

  • Carrot2: Text and search results clustering framework.
  • Chemicalize.org: A chemical structure miner and web search engine.
  • ELKI: A university research project with advanced cluster analysis and outlier detection methods written in the Java language.
  • GATE: a natural language processing and language engineering tool.
  • KNIME: The Konstanz Information Miner, a user-friendly and comprehensive data analytics framework.
  • Massive Online Analysis (MOA): a real-time big data stream mining with concept drift tool in the Java programming language.
  • MEPX: cross-platform tool for regression and classification problems based on a Genetic Programming variant.
  • mlpack: a collection of ready-to-use machine learning algorithms written in the C++ language.
  • NLTK (Natural Language Toolkit): A suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python language.
  • OpenNN: Open neural networks library.
  • Orange: A component-based data mining and machine learning software suite written in the Python language.
  • PSPP: Data mining and statistics software under the GNU Project similar to SPSS
  • R: A programming language and software environment for statistical computing, data mining, and graphics. It is part of the GNU Project.
  • Scikit-learn: an open-source machine learning library for the Python programming language
  • Torch: An open-source deep learning library for the Lua programming language and scientific computing framework with wide support for machine learning algorithms.
  • UIMA: The UIMA (Unstructured Information Management Architecture) is a component framework for analyzing unstructured content such as text, audio and video – originally developed by IBM.
  • Weka: A suite of machine learning software applications written in the Java programming language.

Proprietary data-mining software and applications

The following applications are available under proprietary licenses.

  • Angoss KnowledgeSTUDIO: data mining tool
  • LIONsolver: an integrated software application for data mining, business intelligence, and modeling that implements the Learning and Intelligent OptimizatioN (LION) approach.
  • PolyAnalyst: data and text mining software by Megaputer Intelligence.
  • Microsoft Analysis Services: data mining software provided by Microsoft.
  • NetOwl: suite of multilingual text and entity analytics products that enable data mining.
  • Oracle Data Mining: data mining software by Oracle Corporation.
  • PSeven: platform for automation of engineering simulation and analysis, multidisciplinary optimization and data mining provided by DATADVANCE.
  • Qlucore Omics Explorer: data mining software.
  • RapidMiner: An environment for machine learning and data mining experiments.
  • SAS Enterprise Miner: data mining software provided by the SAS Institute.
  • SPSS Modeler: data mining software provided by IBM.
  • STATISTICA Data Miner: data mining software provided by StatSoft.
  • Tanagra: Visualisation-oriented data mining software, also for teaching.
  • Vertica: data mining software provided by Hewlett-Packard.
  • Google Cloud Platform: automated custom ML models managed by Google.
  • Amazon SageMaker: managed service provided by Amazon for creating & productionising custom ML models.

See also

Methods
  • Agent mining
  • Anomaly/outlier/change detection
  • Association rule learning
  • Bayesian networks
  • Classification
  • Cluster analysis
  • Decision trees
  • Ensemble learning
  • Factor analysis
  • Genetic algorithms
  • Intention mining
  • Learning classifier system
  • Multilinear subspace learning
  • Neural networks
  • Regression analysis
  • Sequence mining
  • Structured data analysis
  • Support vector machines
  • Text mining
  • Time series analysis
Application domains
  • Analytics
  • Behavior informatics
  • Big data
  • Bioinformatics
  • Business intelligence
  • Data analysis
  • Data warehouse
  • Decision support system
  • Domain driven data mining
  • Drug discovery
  • Exploratory data analysis
  • Predictive analytics
  • Web mining
Application examples
  • Automatic number plate recognition in the United Kingdom
  • Customer analytics
  • Educational data mining
  • National Security Agency
  • Quantitative structure–activity relationship
  • Surveillance / Mass surveillance (e.g., Stellar Wind)
Related topics

For more information about extracting information out of data (as opposed to analyzing data), see:

  • Data integration
  • Data transformation
  • Electronic discovery
  • Information extraction
  • Information integration
  • Named-entity recognition
  • Profiling (computer science)
  • Psychometrics
  • Social media mining
  • Surveillance capitalism
  • Web scraping
Other resources
  • International Journal of Data Warehousing and Mining

References

  1. ^ a b c "Data Mining Curriculum". ACM SIGKDD. 2006-04-30. Retrieved 2014-01-27.
  2. ^ Clifton, Christopher (2010). "Encyclopædia Britannica: Definition of Data Mining". Retrieved 2010-12-09.
  3. ^ Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). "The Elements of Statistical Learning: Data Mining, Inference, and Prediction". Archived from the original on 2009-11-10. Retrieved 2012-08-07.
  4. ^ Han, Jiawei; Kamber, Micheline; Pei, Jian (2011). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann. ISBN 978-0-12-381479-1.
  5. ^ a b c Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). "From Data Mining to Knowledge Discovery in Databases" (PDF). Retrieved 17 December 2008.
  6. ^ Han, Jiawei; Kamber, Micheline (2001). Data mining: concepts and techniques. Morgan Kaufmann. p. 5. ISBN 978-1-55860-489-6. Thus, data mining should have been more appropriately named "knowledge mining from data," which is unfortunately somewhat long
  7. ^ OKAIRP 2005 Fall Conference, Arizona State University. Archived 2014-02-01 at the Wayback Machine
  8. ^ Witten, Ian H.; Frank, Eibe; Hall, Mark A. (2011). Data Mining: Practical Machine Learning Tools and Techniques (3 ed.). Elsevier. ISBN 978-0-12-374856-0.
  9. ^ Bouckaert, Remco R.; Frank, Eibe; Hall, Mark A.; Holmes, Geoffrey; Pfahringer, Bernhard; Reutemann, Peter; Witten, Ian H. (2010). "WEKA Experiences with a Java open-source project". Journal of Machine Learning Research. 11: 2533–2541. the original title, "Practical machine learning", was changed ... The term "data mining" was [added] primarily for marketing reasons.
  10. ^ Olson, D. L. (2007). Data mining in business services. Service Business, 1(3), 181–193. doi:10.1007/s11628-006-0014-7
  11. ^ Lovell, Michael C. (1983). "Data Mining". The Review of Economics and Statistics. 65 (1): 1–12. doi:10.2307/1924403. JSTOR 1924403.
  12. ^ Charemza, Wojciech W.; Deadman, Derek F. (1992). "Data Mining". New Directions in Econometric Practice. Aldershot: Edward Elgar. pp. 14–31. ISBN 1-85278-461-X.
  13. ^ Mena, Jesús (2011). Machine Learning Forensics for Law Enforcement, Security, and Intelligence. Boca Raton, FL: CRC Press (Taylor & Francis Group). ISBN 978-1-4398-6069-4.
  14. ^ Piatetsky-Shapiro, Gregory; Parker, Gary (2011). "Lesson: Data Mining, and Knowledge Discovery: An Introduction". Introduction to Data Mining. KD Nuggets. Retrieved 30 August 2012.
  15. ^ Fayyad, Usama (15 June 1999). "First Editorial by Editor-in-Chief". SIGKDD Explorations. 13 (1): 102. doi:10.1145/2207243.2207269. S2CID 13314420. Retrieved 27 December 2010.
  16. ^ Coenen, Frans (2011-02-07). "Data mining: past, present and future". The Knowledge Engineering Review. 26 (1): 25–29. doi:10.1017/S0269888910000378. ISSN 0269-8889. S2CID 6487637.
  17. ^ Kantardzic, Mehmed (2003). Data Mining: Concepts, Models, Methods, and Algorithms. John Wiley & Sons. ISBN 978-0-471-22852-3. OCLC 50055336.
  18. ^ Gregory Piatetsky-Shapiro (2002) KDnuggets Methodology Poll, Gregory Piatetsky-Shapiro (2004) KDnuggets Methodology Poll, Gregory Piatetsky-Shapiro (2007) KDnuggets Methodology Poll, Gregory Piatetsky-Shapiro (2014) KDnuggets Methodology Poll
  19. ^ Lukasz Kurgan and Petr Musilek: "A survey of Knowledge Discovery and Data Mining process models". The Knowledge Engineering Review. Volume 21 Issue 1, March 2006, pp 1–24, Cambridge University Press, New York, doi:10.1017/S0269888906000737
  20. ^ Azevedo, A. and Santos, M. F. KDD, SEMMA and CRISP-DM: a parallel overview. Archived 2013-01-09 at the Wayback Machine. In Proceedings of the IADIS European Conference on Data Mining 2008, pp 182–185.
  21. ^ Hawkins, Douglas M (2004). "The problem of overfitting". Journal of Chemical Information and Computer Sciences. 44 (1): 1–12. doi:10.1021/ci0342472. PMID 14741005.
  22. ^ "Microsoft Academic Search: Top conferences in data mining". Microsoft Academic Search.
  23. ^ "Google Scholar: Top publications - Data Mining & Analysis". Google Scholar.
  24. ^ Proceedings. Archived 2010-04-30 at the Wayback Machine, International Conferences on Knowledge Discovery and Data Mining, ACM, New York.
  25. ^ SIGKDD Explorations, ACM, New York.
  26. ^ Günnemann, Stephan; Kremer, Hardy; Seidl, Thomas (2011). "An extension of the PMML standard to subspace clustering models". Proceedings of the 2011 workshop on Predictive markup language modeling. p. 48. doi:10.1145/2023598.2023605. ISBN 978-1-4503-0837-3. S2CID 14967969.
  27. ^ Seltzer, William (2005). "The Promise and Pitfalls of Data Mining: Ethical Issues" (PDF). ASA Section on Government Statistics. American Statistical Association.
  28. ^ Pitts, Chip (15 March 2007). "The End of Illegal Domestic Spying? Don't Count on It". Washington Spectator. Archived from the original on 2007-11-28.
  29. ^ Taipale, Kim A. (15 December 2003). "Data Mining and Domestic Security: Connecting the Dots to Make Sense of Data". Columbia Science and Technology Law Review. 5 (2). OCLC 45263753. SSRN 546782.
  30. ^ Resig, John. "A Framework for Mining Instant Messaging Services" (PDF). Retrieved 16 March 2018.
  31. ^ a b c Think Before You Dig: Privacy Implications of Data Mining & Aggregation. Archived 2008-12-17 at the Wayback Machine, NASCIO Research Brief, September 2004
  32. ^ Ohm, Paul. "Don't Build a Database of Ruin". Harvard Business Review.
  33. ^ AOL search data identified individuals, SecurityFocus, August 2006
  34. ^ Kshetri, Nir (2014). "Big data's impact on privacy, security and consumer welfare" (PDF). Telecommunications Policy. 38 (11): 1134–1145. doi:10.1016/j.telpol.2014.10.002.
  35. ^ Weiss, Martin A.; Archick, Kristin (19 May 2016). "U.S.–E.U. Data Privacy: From Safe Harbor to Privacy Shield" (PDF). Washington, D.C. Congressional Research Service. p. 6. R44257. Retrieved 9 April 2020. On October 6, 2015, the CJEU ... issued a decision that invalidated Safe Harbor (effective immediately), as currently implemented.
  36. ^ Parker, George. "UK Companies Targeted for Using Big Data to Exploit Customers." Subscribe to Read | Financial Times, Financial Times, 30 Sept. 2018, https://www.ft.com/content/5dbd98ca-c491-11e8-bc21-54264d1c4647.
  37. ^ Biotech Business Week Editors (June 30, 2008); BIOMEDICINE; HIPAA Privacy Rule Impedes Biomedical Research, Biotech Business Week, retrieved 17 November 2009 from LexisNexis Academic
  38. ^ UK Researchers Given Data Mining Right Under New UK Copyright Laws. Archived June 9, 2014, at the Wayback Machine Out-Law.com. Retrieved 14 November 2014
  39. ^ "Fedlex".
  40. ^ "Licences for Europe – Structured Stakeholder Dialogue 2013". European Commission. Retrieved 14 November 2014.
  41. ^ "Text and Data Mining: Its importance and the need for change in Europe". Association of European Research Libraries. Retrieved 14 November 2014.
  42. ^ "Judge grants summary judgment in favor of Google Books – a fair use victory". Lexology.com. Antonelli Law Ltd. 19 November 2013. Retrieved 14 November 2014.

Further reading

  • Cabena, Peter; Hadjnian, Pablo; Stadler, Rolf; Verhees, Jaap; Zanasi, Alessandro (1997); Discovering Data Mining: From Concept to Implementation, Prentice Hall, ISBN 0-13-743980-6
  • M.S. Chen, J. Han, P.S. Yu (1996) "Data mining: an overview from a database perspective". Knowledge and Data Engineering, IEEE Transactions on 8 (6), 866–883
  • Feldman, Ronen; Sanger, James (2007); The Text Mining Handbook, Cambridge University Press, ISBN 978-0-521-83657-9
  • Guo, Yike; and Grossman, Robert (editors) (1999); High Performance Data Mining: Scaling Algorithms, Applications and Systems, Kluwer Academic Publishers
  • Han, Jiawei, Micheline Kamber, and Jian Pei. Data mining: concepts and techniques. Morgan Kaufmann, 2006.
  • Hastie, Trevor, Tibshirani, Robert and Friedman, Jerome (2001); The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, ISBN 0-387-95284-5
  • Liu, Bing (2007, 2011); Web Data Mining: Exploring Hyperlinks, Contents and Usage Data, Springer, ISBN 3-540-37881-2
  • Murphy, Chris (16 May 2011). "Is Data Mining Free Speech?". InformationWeek: 12.
  • Nisbet, Robert; Elder, John; Miner, Gary (2009); Handbook of Statistical Analysis & Data Mining Applications, Academic Press/Elsevier, ISBN 978-0-12-374765-5
  • Poncelet, Pascal; Masseglia, Florent; and Teisseire, Maguelonne (editors) (October 2007); "Data Mining Patterns: New Methods and Applications", Information Science Reference, ISBN 978-1-59904-162-9
  • Tan, Pang-Ning; Steinbach, Michael; and Kumar, Vipin (2005); Introduction to Data Mining, ISBN 0-321-32136-7
  • Theodoridis, Sergios; and Koutroumbas, Konstantinos (2009); Pattern Recognition, 4th Edition, Academic Press, ISBN 978-1-59749-272-0
  • Weiss, Sholom M.; and Indurkhya, Nitin (1998); Predictive Data Mining, Morgan Kaufmann
  • Witten, Ian H.; Frank, Eibe; Hall, Mark A. (30 January 2011). Data Mining: Practical Machine Learning Tools and Techniques (3 ed.). Elsevier. ISBN 978-0-12-374856-0. (See also Free Weka software)
  • Ye, Nong (2003); The Handbook of Data Mining, Mahwah, NJ: Lawrence Erlbaum

External links

  • Knowledge Discovery Software at Curlie
  • Data Mining Tool Vendors at Curlie

Source: https://en.wikipedia.org/wiki/Data_mining

Posted by: martintrathem2001.blogspot.com
