Chess (King-Rook vs. King-Knight): Knight
38. Chess (King-Rook vs. King-Pawn): King+Rook versus King+Pawn on a7 (usually abbreviated KRKPA7).
39. Cloud: Little documentation.
40. CMU Face Images: This data consists of 640 black-and-white face images of people taken with varying pose (straight, left, right, up), expression (neutral, happy, sad, angry), eyes (wearing sunglasses or not), and size.
41. Coil 1999 Competition Data: This data set is from the 1999 Computational Intelligence and Learning (COIL) competition. The data contains measurements of river chemical concentrations and algae densities.
42. Communities and Crime: Communities within the United States. The data combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and crime data from the 1995 FBI UCR.
43. Communities and Crime Unnormalized: Communities in the US. The data combines socio-economic data from the 1990 Census, law enforcement data from the 1990 Law Enforcement Management and Administrative Statistics (LEMAS) survey, and crime data from the 1995 FBI UCR.
44. Computer Hardware: Relative CPU performance data, described in terms of cycle time, memory size, etc.
45. Concrete Compressive Strength: Concrete is the most important material in civil engineering. The concrete compressive strength is a highly nonlinear function of age and ingredients.
46. Concrete Slump Test: Concrete is a highly complex material. The slump flow of concrete is determined not only by the water content but also by the other concrete ingredients.
47. Congressional Voting Records: 1984 United States Congressional Voting Records; classify as Republican or Democrat.
48. Connect-4: Contains Connect-4 positions.
49. Connectionist Bench (Nettalk Corpus): The file "nettalk.data" contains a list of 20,008 English words, along with a phonetic transcription for each word. The task is to train a network to produce the proper phonemes.
50. Connectionist Bench (Sonar, Mines vs. Rocks): The task is to train a network to discriminate between sonar signals bounced off a metal cylinder and those bounced off a roughly cylindrical rock.
51. Connectionist Bench (Vowel Recognition - Deterding Data): Speaker-independent recognition of the eleven steady-state vowels of British English, using a specified training set of LPC-derived log area ratios.
52. Contraceptive Method Choice: The dataset is a subset of the 1987 National Indonesia Contraceptive Prevalence Survey.
53. Corel Image Features: This dataset contains image features extracted from a Corel image collection. Four sets of features are available, based on the color histogram, color histogram layout, color moments, and co-occurrence.
54. Covertype: Forest CoverType dataset.
55. Credit Approval: This data concerns credit card applications; good mix of attributes.
56. Cylinder Bands: Used in decision tree induction for mitigating process delays known as "cylinder bands" in rotogravure printing.
57. Demospongiae: Classification domain for marine sponges of the class Demospongiae.
58. Dermatology: The aim of this dataset is to determine the type of Erythemato-Squamous Disease.
59. Dexter: DEXTER is a text classification problem in a bag-of-words representation. This is a two-class classification problem with sparse continuous input variables. This dataset is one of five datasets of the NIPS 2003 feature selection challenge.
60. DGP2 - The Second Data Generation Program: Generates application domains based on specific parameters, number of features, and proportion of positive to negative examples.
61. Diabetes: This diabetes dataset is from AIM '94.
62. Document Understanding: Five concepts, expressed as predicates, to be learned.
63. Dodgers Loop Sensor: Loop sensor data was collected for the Glendale on-ramp for the 101 North freeway in Los Angeles.
64. Dorothea: DOROTHEA is a drug discovery dataset. Chemical compounds represented by structural molecular features must be classified as active (binding to thrombin) or inactive. This is one of five datasets of the NIPS 2003 feature selection challenge.
65. E. Coli Genes: Data giving characteristics of each ORF (potential gene) in the E. coli genome. Sequence, homology (similarity to other genes), structural information, and function (if known) are provided.
66. EBL Domain Theories: Assorted small-scale domain theories.
67. Echocardiogram: Data for classifying whether patients will survive for at least one year after a heart attack.
68. Ecoli: This data contains protein localization sites.
69. Economic Sanctions: Domain theory on economic sanctions; undocumented.
70. EEG Database: This data arises from a large study to examine EEG correlates of genetic predisposition to alcoholism. It contains measurements from 64 electrodes placed on the scalp, sampled at 256 Hz.
71. El Nino: The data set contains oceanographic and surface meteorological readings taken from a series of buoys positioned throughout the equatorial Pacific.
72. Entree Chicago Recommendation Data: This data contains a record of user interactions with the Entree Chicago restaurant recommendation system.
73. Flags: From Collins Gem Guide to Flags, 1986.
74. Forest Fires: This is a difficult regression task, where the aim is to predict the burned area of forest fires, in the northeast region of Portugal, by using meteorological and other data (see details at: