RESEARCH ARTICLES

COGNITIVE SCIENCE

Human-level concept learning through probabilistic program induction

Brenden M. Lake,1* Ruslan Salakhutdinov,2 Joshua B. Tenenbaum3

1Center for Data Science, New York University, 726 Broadway, New York, NY 10003, USA. 2Department of Computer Science and Department of Statistics, University of Toronto, 6 King's College Road, Toronto, ON M5S 3G4, Canada. 3Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
*Corresponding author. E-mail: brenden@nyu.edu

People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy. People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation. We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world's alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several "visual Turing tests" probing the model's creative generalization abilities, which in many cases are indistinguishable from human behavior.

Despite remarkable advances in artificial intelligence and machine learning, two aspects of human conceptual knowledge have eluded machine systems. First, for most interesting kinds of natural and man-made categories, people can learn a new concept from just one or a handful of examples, whereas standard algorithms in machine learning require tens or hundreds of examples to perform similarly. For instance, people may only need to see one example of a novel two-wheeled vehicle (Fig. 1A) in order to grasp the boundaries of the new concept, and even children can make meaningful generalizations via "one-shot learning" (1–3). In contrast, many of the leading approaches in machine learning are also the most data-hungry, especially "deep learning" models that have achieved new levels of performance on object and speech recognition benchmarks (4–9). Second, people learn richer representations than machines do, even for simple concepts (Fig. 1B), using them for a wider range of functions, including (Fig. 1, ii) creating new exemplars (10), (Fig. 1, iii) parsing objects into parts and relations (11), and (Fig. 1, iv) creating new abstract categories of objects based on existing categories (12, 13). In contrast, the best machine classifiers do not perform these additional functions, which are rarely studied and usually require specialized algorithms. A central challenge is to explain these two aspects of human-level concept learning: How do people learn new concepts from just one or a few examples? And how do people learn such abstract, rich, and flexible representations? An even greater challenge arises when putting them together: How can learning succeed from such sparse data yet also produce such rich representations? For any theory of

Fig. 1. People can learn rich concepts from limited data. (A and B) A single example of a new concept (red boxes) can be enough information to support the (i) classification of new examples, (ii) generation of new examples, (iii) parsing an object into parts and relations (parts segmented by color), and (iv) generation of new concepts from related concepts. [Image credit for (A), iv, bottom: With permission from Glenn Roberts and Motorcycle Mojo Magazine]
11 DECEMBER 2015 • VOL 350 ISSUE 6266 • sciencemag.org SCIENCE
learning (4, 14–16), fitting a more complicated model requires more data, not less, in order to achieve some measure of good generalization, usually the difference in performance between new and old examples. Nonetheless, people seem to navigate this trade-off with remarkable agility, learning rich concepts that generalize well from sparse data.
This paper introduces the Bayesian program learning (BPL) framework, capable of learning a large class of visual concepts from just a single example and generalizing in ways that are mostly indistinguishable from people. Concepts are represented as simple probabilistic programs—that is, probabilistic generative models expressed as structured procedures in an abstract description language (17, 18). Our framework brings together three key ideas—compositionality, causality, and learning to learn—that have been separately influential in cognitive science and machine learning over the past several decades (19–22). As programs, rich concepts can be built "compositionally" from simpler primitives. Their probabilistic semantics handle noise and support creative generalizations in a procedural form that (unlike other probabilistic models) naturally captures the abstract "causal" structure of the real-world processes that produce examples of a category. Learning proceeds by constructing programs that best explain the observations under a Bayesian criterion, and the model "learns to learn" (23, 24) by developing hierarchical priors that allow previous experience with related concepts to ease learning of new concepts (25, 26). These priors represent a learned inductive bias (27) that abstracts the key regularities and dimensions of variation holding across both types of concepts and across instances (or tokens) of a concept in a given domain. In short, BPL can construct new programs by reusing the pieces of existing ones, capturing the causal and compositional properties of real-world generative processes operating on multiple scales.
In addition to developing the approach sketched above, we directly compared people, BPL, and other computational approaches on a set of five challenging concept learning tasks (Fig. 1B). The tasks use simple visual concepts from Omniglot, a data set we collected of multiple examples of 1623 handwritten characters from 50 writing systems (Fig. 2) (see acknowledgments). Both images and pen strokes were collected (see below) as detailed in section S1 of the online supplementary materials. Handwritten characters are well suited for comparing human and machine learning on a relatively even footing: They are both cognitively natural and often used as a benchmark for comparing learning algorithms. Whereas machine learning algorithms are typically evaluated after hundreds or thousands of training examples per class (5), we evaluated the tasks of classification, parsing (Fig. 1B, iii), and generation (Fig. 1B, ii) of new examples in their most challenging form: after just one example of a new concept. We also investigated more creative tasks that asked people and computational models to generate new concepts (Fig. 1B, iv). BPL was compared with three deep learning models, a classic pattern recognition algorithm, and various lesioned versions of the model—a breadth of comparisons that serve to isolate the role of each modeling ingredient (see section S4 for descriptions of alternative models). We compare with two varieties of deep convolutional networks (28), representative of the current leading approaches to object recognition (7), and a hierarchical deep (HD) model (29), a probabilistic model needed for our more generative tasks and specialized for one-shot learning.

Bayesian Program Learning
The BPL approach learns simple stochastic programs to represent concepts, building them compositionally from parts (Fig. 3A, iii), subparts (Fig. 3A, ii), and spatial relations (Fig. 3A, iv). BPL defines a generative model that can sample new types of concepts (an "A," "B," etc.) by combining parts and subparts in new ways. Each new type is also represented as a generative model, and this lower-level generative model produces new examples (or tokens) of the concept (Fig. 3A, v), making BPL a generative model for generative models. The final step renders the token-level variables in the format of the raw data (Fig. 3A, vi). The joint distribution on types ψ, a set of M tokens of that type θ(1), ..., θ(M), and the corresponding binary images I(1), ..., I(M) factors as
\[
P\bigl(\psi, \theta^{(1)}, \ldots, \theta^{(M)}, I^{(1)}, \ldots, I^{(M)}\bigr) = P(\psi) \prod_{m=1}^{M} P\bigl(I^{(m)} \mid \theta^{(m)}\bigr)\, P\bigl(\theta^{(m)} \mid \psi\bigr) \qquad (1)
\]
The generative process for types P(ψ) and tokens P(θ(m) | ψ) are described by the pseudocode in Fig. 3B and detailed along with the image model P(I(m) | θ(m)) in section S2. Source code is available online (see acknowledgments). The model learns to learn by fitting each conditional distribution to a background set of characters from 30 alphabets, using both the image and the stroke data, and this image set was also used to pretrain the alternative deep learning models. Neither the production data nor any alphabets from this set are used in the subsequent evaluation tasks, which provide the models with only raw images of novel characters.
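As a toy illustration of how the joint distribution in Eq. 1 factors into a type prior, per-token models, and per-image models, the sketch below replaces the real stroke-level distributions with simple Gaussian stand-ins. All distributions and parameters here are hypothetical placeholders, not the fitted BPL conditionals:

```python
import random
import math

# Toy stand-ins for P(psi), P(theta|psi), and P(I|theta); the real BPL
# model uses strokes, relations, and a rendering model instead.

def sample_type(rng):
    """Sample a 'type' psi: here just a scalar."""
    return rng.gauss(0.0, 1.0)

def sample_token(psi, rng):
    """Sample a token theta given the type (motor noise around psi)."""
    return rng.gauss(psi, 0.1)

def sample_image(theta, rng):
    """Render a token into noisy observed data."""
    return rng.gauss(theta, 0.05)

def log_normal(x, mu, sigma):
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def joint_log_prob(psi, thetas, images):
    """log P(psi, theta^(1..M), I^(1..M)) factored as in Eq. 1."""
    lp = log_normal(psi, 0.0, 1.0)                # log P(psi)
    for theta, img in zip(thetas, images):
        lp += log_normal(theta, psi, 0.1)         # log P(theta^(m) | psi)
        lp += log_normal(img, theta, 0.05)        # log P(I^(m) | theta^(m))
    return lp

rng = random.Random(0)
psi = sample_type(rng)
thetas = [sample_token(psi, rng) for _ in range(3)]
images = [sample_image(t, rng) for t in thetas]
print(joint_log_prob(psi, thetas, images))
```

The point of the sketch is only the structure: one draw from the type prior, then an independent token-and-image pair per example, so the log joint is a sum over the M factors in Eq. 1.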
Fig. 2. Simple visual concepts for comparing human and machine learning. 525 (out of 1623) character concepts, shown with one example each.

Handwritten character types ψ are an abstract schema of parts, subparts, and relations. Reflecting the causal structure of the handwriting process, character parts Si are strokes initiated by pressing the pen down and terminated by lifting it up (Fig. 3A, iii), and subparts si1, ..., sini are more primitive movements separated by brief pauses of
Fig. 3. A generative model of handwritten characters. (A) New types are generated by choosing primitive actions (color coded) from a library (i), combining these subparts (ii) to make parts (iii), and combining parts with relations to define simple programs (iv). New tokens are generated by running these programs (v), which are then rendered as raw data (vi). (B) Pseudocode for generating new types ψ and new token images I(m) for m = 1, ..., M. The function f(·,·) transforms a subpart sequence and start location into a trajectory.

Fig. 4. Inferring motor programs from images. Parts are distinguished by color, with a colored dot indicating the beginning of a stroke and an arrowhead indicating the end. (A) The top row shows the five best programs discovered for an image along with their log-probability scores (Eq. 1). Subpart breaks are shown as black dots. For classification, each program was refit to three new test images (left in image triplets), and the best-fitting parse (top right) is shown with its image reconstruction (bottom right) and classification score (log posterior predictive probability). The correctly matching test item receives a much higher classification score and is also more cleanly reconstructed by the best programs induced from the training item. (B) Nine human drawings of three characters (left) are shown with their ground truth parses (middle) and best model parses (right).

the pen (Fig. 3A, ii). To construct a new character type, first the model samples the number of parts k and the number of subparts ni, for each part i = 1, ..., k, from their empirical distributions as measured from the background set. Second, a template for a part Si is constructed by sampling subparts from a set of discrete primitive actions learned from the background set (Fig. 3A, i), such that the probability of the next action depends on the previous. Third, parts are then grounded as parameterized curves (splines) by sampling the control points and scale parameters
for each subpart. Last, parts are roughly positioned to begin either independently, at the beginning, at the end, or along previous parts, as defined by relation Ri (Fig. 3A, iv).
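The type-generation steps above can be sketched as follows. The primitive library, part and subpart counts, transition table, and relation choices below are made-up placeholders, not the empirical distributions fitted to the Omniglot background set:

```python
import random

# Hypothetical sketch of sampling a character type: sample the number of
# parts and subparts, build each part from a first-order chain over
# primitive actions, and attach a spatial relation to each part.

PRIMITIVES = ["arc", "line", "hook", "loop"]
# Placeholder P(next primitive | previous primitive): uniform rows
TRANSITIONS = {p: {q: 0.25 for q in PRIMITIVES} for p in PRIMITIVES}
RELATIONS = ["independent", "start", "end", "along"]

def sample_categorical(weights, rng):
    items, probs = zip(*weights.items())
    return rng.choices(items, weights=probs, k=1)[0]

def sample_type(rng):
    k = rng.choice([1, 2, 3])              # number of parts (empirical in BPL)
    parts = []
    for i in range(k):
        n_i = rng.choice([1, 2])           # number of subparts for this part
        subparts, prev = [], None
        for _ in range(n_i):
            if prev is None:
                s = rng.choice(PRIMITIVES)
            else:
                # next action depends on the previous one
                s = sample_categorical(TRANSITIONS[prev], rng)
            subparts.append(s)
            prev = s
        # first part has nothing to attach to; later parts pick a relation
        relation = "independent" if i == 0 else rng.choice(RELATIONS)
        parts.append({"subparts": subparts, "relation": relation})
    return parts

rng = random.Random(1)
print(sample_type(rng))
```

The sketch omits the spline grounding step (control points and scale), which in the full model turns each discrete subpart sequence into a parameterized curve.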
Character tokens θ(m) are produced by executing the parts and the relations and modeling how ink flows from the pen to the page. First, motor noise is added to the control points and the scale of the subparts to create token-level stroke trajectories S(m). Second, the trajectory's precise start location L(m) is sampled from the schematic provided by its relation Ri to previous strokes. Third, global transformations are sampled, including an affine warp A(m) and adaptive noise parameters that ease probabilistic inference (30). Last, a binary image I(m) is created by a stochastic rendering function, lining the stroke trajectories with grayscale ink and interpreting the pixel values as independent Bernoulli probabilities.

Posterior inference requires searching the large combinatorial space of programs that could have generated a raw image I(m). Our strategy uses fast bottom-up methods (31) to propose a range of candidate parses. The most promising candidates are refined by using continuous optimization and local search, forming a discrete approximation to the posterior distribution P(ψ, θ(m) | I(m)) (section S3). Figure 4A shows the set of discovered programs for a training image I(1) and how they are refit to different test images I(2) to compute a classification score log P(I(2) | I(1)) (the log posterior predictive probability), where higher scores indicate that they are more likely to belong to the same class. A high score is achieved when at least one set of parts and relations can successfully explain both the training and the test images, without violating the soft constraints of the learned within-class variability model. Figure 4B compares the model's best-scoring parses with the ground-truth human parses for several characters.

Results
People, BPL, and alternative models were compared side by side on five concept learning tasks that examine different forms of generalization from just one or a few examples (example task Fig. 5). All behavioral experiments were run through Amazon's Mechanical Turk, and the experimental procedures are detailed in section S5.

Fig. 5. Generating new exemplars. Humans and machines were given an image of a novel character (top) and asked to produce new exemplars. The nine-character grids in each pair that were generated by a machine are (by row) 1, 2; 2, 1; 1, 1.
The main results are summarized by Fig. 6, and additional lesion analyses and controls are reported in section S6.
One-shot classification was evaluated through a series of within-alphabet classification tasks for 10 different alphabets. As illustrated in Fig. 1B, i, a single image of a new character was presented, and participants selected another example of that same character from a set of 20 distinct characters produced by a typical drawer of that alphabet. Performance is shown in Fig. 6A, where chance is 95% errors. As a baseline, the modified Hausdorff distance (32) was computed between centered images, producing 38.8% errors. People were skilled one-shot learners, achieving an average error rate of 4.5% (N = 40). BPL showed a similar error rate of 3.3%, achieving better performance than a deep convolutional network (convnet; 13.5% errors) and the HD model (34.8%)—each adapted from deep learning methods that have performed well on a range of computer vision tasks. A deep Siamese convolutional network optimized for this one-shot learning task achieved 8.0% errors (33), still about twice as high as humans or our model. BPL's advantage points to the benefits of modeling the underlying causal process in learning concepts, a strategy different from the particular deep learning approaches examined here. BPL's other key ingredients also make positive contributions, as shown by higher error rates for BPL lesions without learning to learn (token-level only) or compositionality (11.0% errors and 14.0%, respectively). Learning to learn was studied separately at the type and token level by disrupting the learned hyperparameters of the generative model. Compositionality was evaluated by comparing BPL to a matched model that allowed just one spline-based stroke, resembling earlier analysis-by-synthesis models for handwritten characters that were similarly limited (34, 35).
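The modified Hausdorff distance baseline (32) can be sketched with the standard mean-of-minima formulation over two point sets; here, hypothetical lists of ink-pixel coordinates stand in for the centered images:

```python
import math

# Minimal modified Hausdorff distance: for each direction, average the
# distance from every point in one set to its nearest point in the other,
# then take the larger of the two directed averages.

def directed(A, B):
    """Mean over points in A of the distance to the nearest point in B."""
    return sum(min(math.dist(a, b) for b in B) for a in A) / len(A)

def modified_hausdorff(A, B):
    return max(directed(A, B), directed(B, A))

# Two parallel rows of 'ink' pixels, one unit apart
A = [(0, 0), (1, 0), (2, 0)]
B = [(0, 1), (1, 1), (2, 1)]
print(modified_hausdorff(A, B))  # 1.0: every point is distance 1 from the other set
```

Classification with this baseline then amounts to picking the candidate image whose point set has the smallest distance to the test image's point set.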
The human capacity for one-shot learning is more than just classification. It can include a suite of abilities, such as generating new examples of a concept. We compared the creative outputs produced by humans and machines through "visual Turing tests," where naive human judges tried to identify the machine, given paired examples of human and machine behavior. In our most basic task, judges compared the drawings from nine humans asked to produce a new instance of a concept given one example with nine new examples drawn by BPL (Fig. 5). We evaluated each model based on the accuracy of the judges, which we call their identification (ID) level: Ideal model performance is 50% ID level, indicating that they cannot distinguish the model's behavior from humans; worst-case performance is 100%. Each judge (N = 147) completed 49 trials with blocked feedback, and judges were analyzed individually and in aggregate. The results are shown in Fig. 6B (new exemplars). Judges had only a 52% ID level on average for discriminating human versus BPL behavior. As a group, this performance was barely better than chance [t(47) = 2.03, P = 0.048], and only 3 of 48 judges had an ID level reliably above chance. Three lesioned models were evaluated by different groups of judges in separate
conditions of the visual Turing test, examining the necessity of key model ingredients in BPL. Two lesions, learning to learn (token-level only) and compositionality, resulted in significantly easier Turing test tasks (80% ID level with 17 of 19 judges above chance and 65% with 14 of 26, respectively), indicating that this task is a nontrivial one to pass and that these two principles each contribute to BPL's human-like generative proficiency. To evaluate parsing more directly (Fig. 4B), we ran a dynamic version of this task with a different set of judges (N = 143), where each trial showed paired movies of a person and BPL drawing the same character. BPL performance on this visual Turing test was not perfect (59% average ID level; new exemplars (dynamic) in Fig. 6B), although randomizing the learned prior on stroke order and direction significantly raises the ID level (71%), showing the importance of capturing the right causal dynamics for BPL.
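The ID-level statistic itself is straightforward to sketch. The judge responses below are made-up illustrations, and the exact binomial tail is one simple way (not necessarily the one used in the paper, which reports t tests) to check whether a single judge is reliably above chance:

```python
import math

# Sketch of the "identification (ID) level" scoring for a visual Turing
# test: the percentage of two-alternative forced-choice trials on which a
# judge correctly picks the machine. 50% is ideal (indistinguishable from
# humans); 100% is the worst case for the model.

def id_level(correct_flags):
    """Percent of trials where the judge correctly picked the machine."""
    return 100.0 * sum(correct_flags) / len(correct_flags)

def p_above_chance(n_correct, n_trials):
    """One-sided exact binomial tail P(X >= n_correct) under chance (p = 0.5)."""
    return sum(math.comb(n_trials, k)
               for k in range(n_correct, n_trials + 1)) / 2**n_trials

# A hypothetical judge at 25 correct out of 49 trials sits near the
# ideal 50% ID level...
print(id_level([True] * 25 + [False] * 24))
# ...and is nowhere near reliably above chance:
print(p_above_chance(25, 49))  # 0.5: exactly chance-level evidence
```

A judge would count as "reliably above chance" only when this tail probability falls below the chosen significance threshold.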
Although learning to learn new characters from 30 background alphabets proved effective, many human learners will have much less experience: perhaps familiarity with only one or a few alphabets, along with related drawing tasks. To see how the models perform with more limited experience, we retrained several of them by using two different subsets of only five background alphabets. BPL achieved similar performance for one-shot classification as with 30 alphabets (4.3% and 4.0% errors, for the two sets, respectively); in contrast, the deep convolutional net performed notably worse than before (24.0% and 22.3% errors). BPL performance on a visual Turing test of exemplar generation (N = 59) was also similar on the first set [52% average ID level that was not significantly different from chance t(26) = 1.04, P > 0.05], with only 3 of 27 judges reliably above chance, although performance on the second set was slightly worse [57% ID level; t(31) = 4.35, P < 0.001; 7 of 32 judges reliably above chance]. These results suggest that although learning to learn is important for BPL's success, the model's structure allows it to take nearly full advantage of comparatively limited background training.

The human productive capacity goes beyond generating new examples of a given concept: People can also generate whole new concepts. We tested this by showing a few example characters from 1 of 10 foreign alphabets and asking participants to quickly create a new character that appears to belong to the same alphabet (Fig. 7A). The BPL model can capture this behavior by placing a nonparametric prior on the type level, which favors reusing strokes inferred from the example characters to produce stylistically consistent new characters (section S7). Human judges compared people versus BPL in a visual Turing test (N = 117), viewing a series of displays in the format of Fig. 7A, i and iii. The judges had only a 49% ID level on average [Fig. 6B, new concepts (from type)], which is not significantly different from chance [t(34) = 0.45, P > 0.05]. Individually, only 8 of 35 judges had an ID level significantly above chance. In contrast, a model with a lesion to (type-level) learning to learn was successfully detected by judges on 69% of trials in a separate
condition of the visual Turing test, and was significantly easier to detect than BPL (18 of 25 judges above chance). Further comparisons in section S6 suggested that the model's ability to produce plausible novel characters, rather than stylistic consistency per se, was the crucial factor for passing this test. We also found greater variation in individual judges' comparisons of people and the BPL model on this task, as reflected in their ID levels: 10 of 35 judges had individual ID levels significantly below chance; in contrast, only two participants had below-chance ID levels for BPL across all the other experiments shown in Fig. 6B.

Last, judges (N = 124) compared people and models on an entirely free-form task of generating novel character concepts, unconstrained by a particular alphabet (Fig. 7B). Sampling from the prior distribution on character types P(ψ) in BPL led to an average ID level of 57% correct in a visual Turing test (11 of 32 judges above chance); with the nonparametric prior that reuses inferred parts from background characters, BPL achieved a 51% ID level [Fig. 7B and new concepts (unconstrained) in Fig. 6B; ID level not significantly different from chance t(24) = 0.497, P > 0.05; 2 of 25 judges above chance]. A lesion analysis revealed that both compositionality (68% and 15 of 22) and learning to learn (64% and 22 of 45) were crucial in passing this test.

Discussion
Despite a changing artificial intelligence landscape, people remain far better than machines at learning new concepts: They require fewer examples and use their concepts in richer ways. Our work suggests that the principles of compositionality, causality, and learning to learn will be critical in building machines that narrow this gap. Machine learning and computer vision researchers are beginning to explore methods based on simple program induction (36–41), and our results show that this approach can perform one-shot learning in classification tasks at human-level accuracy and fool most judges in visual Turing tests of its more creative abilities. For each visual Turing test, fewer than 25% of judges performed significantly better than chance.
Although successful on these tasks, BPL still sees less structure in visual concepts than people do. It lacks explicit knowledge of parallel lines, symmetry, optional elements such as crossbars in "7"s, and connections between the ends of strokes and other strokes. Moreover, people use their concepts for other abilities that were not studied here, including planning (42), explanation (43), communication (44), and conceptual combination (45). Probabilistic programs could capture these richer aspects of concept learning and use, but only with more abstract and complex structure than the programs studied here. More sophisticated programs could also be suitable for learning compositional, causal representations of many concepts beyond simple perceptual categories. Examples include concepts for physical artifacts, such as tools, vehicles, or furniture, that are well described by parts, relations, and the functions

Fig. 6. Human and machine performance was compared on (A) one-shot classification and (B) four generative tasks. The creative outputs for humans and models were compared by the percentage of human judges who correctly identified the machine. Ideal performance is 50%, where the machine is perfectly confusable with humans in these two-alternative forced choice tasks (pink dotted line). Bars show the mean ± SEM [N = 10 alphabets in (A)]. The no learning-to-learn lesion is applied at different levels (bars left to right): (A) token; (B) token, stroke order, type, and type.