
Nutch study: errors encountered and their solutions

Updated: 2014-01-05 02:57:10  Editor: Author_N1

 

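1. Running bin/nutch crawl urls -dir crawled -depth 3 -topN 50 >&crawl.log under Cygwin

The following error appears: bin/nutch: line 251: exec: C:\Program: not found. The message suggests that a path containing a space (for example, one under C:\Program Files) is being cut off at the space.

Solution: reinstall Cygwin with the complete package set, rather than installing only the handful of packages that online guides recommend.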


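2. Garbled text in the top-right tabs

The "About" and "FAQ" tabs in the top-right corner display correctly on the main search page, but they are garbled on the search results page.

Change the encoding of Tomcat 7.0/webapps/nutch-1.2/zh/header.html to GBK: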

<?xml version="1.0" encoding="GBK"?>
Note: after <?xml version="1.0" encoding="GBK"?>, also add <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
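
Garbled tab labels of this kind usually come from bytes being written in one charset and read back in another. The following is a minimal sketch of that effect, not a description of what Nutch or Tomcat does internally; the class name CharsetMismatchDemo and the method reinterpret are made up for illustration, and the GBK charset is assumed to be available, as it is in standard full JDK builds.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how a charset mismatch garbles text.
class CharsetMismatchDemo {

    /** Encodes text with one charset, then decodes the same bytes with another. */
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // Assumption: GBK is present in this JDK (it ships with standard full builds).
        Charset gbk = Charset.forName("GBK");
        String label = "\u7b80\u4ecb"; // the two-character Chinese "About" tab label

        // Same charset on both sides: the label survives unchanged.
        System.out.println(reinterpret(label, StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // UTF-8 bytes read as GBK: the label comes out garbled.
        System.out.println(reinterpret(label, StandardCharsets.UTF_8, gbk));
    }
}
```

If the declared encoding of a page and the actual bytes of the file disagree in this way, the browser shows the second kind of output.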


3. Adding IKAnalyzer Chinese word segmentation to Nutch 1.2 (see the referenced article)
Modifying the source code as that article describes produces the following error:
LinkDb: finished at 2011-07-14 11:34:06, elapsed: 00:00:03
Indexer: starting at 2011-07-14 11:34:06
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:76)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:167)
Solution: this error occurs while crawling. A likely cause is forgetting to put IKAnalyzer3.2.8.jar into the nutch/lib directory.
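
One quick way to test for a missing jar is to ask the JVM whether the analyzer class can be loaded at all. This is only a sketch: the helper class ClasspathCheck is hypothetical, and the class name org.wltea.analyzer.lucene.IKAnalyzer is an assumption about how IKAnalyzer 3.2.8 packages its analyzer, so adjust it to whatever your jar actually contains.

```java
// Hypothetical helper: reports whether a class is visible on the current classpath.
class ClasspathCheck {

    /** Returns true if the named class can be located without initializing it. */
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class name; not confirmed by the article.
        String analyzer = "org.wltea.analyzer.lucene.IKAnalyzer";
        System.out.println(analyzer + " on classpath: " + isClassAvailable(analyzer));
    }
}
```

Run it with the same classpath that the crawl job uses; a result of false would be consistent with the jar being absent from nutch/lib.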

4. After modifying the source, searching again returns a blank page (this one cost me three days). The error is:
Caused by: java.lang.IllegalArgumentException: This AttributeSource does not have the attribute 'org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute'.

at org.apache.lucene.util.AttributeSource.getAttribute(AttributeSource.java:277)

at org.apache.nutch.summary.basic.BasicSummarizer.getTokens(BasicSummarizer.java:362)

at org.apache.nutch.summary.basic.BasicSummarizer.getSummary(BasicSummarizer.java:134)

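Cause:

Earlier we modified the NutchDocumentAnalyzer class to use the IKAnalyzer class, so the source of the open-source Chinese segmenter IKAnalyzer now has to be modified as well.
IKAnalyzer does not register the PositionIncrementAttribute attribute, which is why the exception is thrown. To fix it, edit IKTokenizer.java in the IKAnalyzer source and add the following: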

// in the import section

import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;

// in the field declarations

private PositionIncrementAttribute posIncrAtt;

// inside the constructor public IKTokenizer(Reader in, boolean isMaxWordLength)

posIncrAtt = addAttribute(PositionIncrementAttribute.class);
Rebuild IKAnalyzer with ant to produce IKAnalyzer3.2.8.jar (this seems to require writing your own ant build.xml; I exported the jar directly from Eclipse instead).

Replace the corresponding file under nutch, then rebuild nutch.
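
To see why a missing addAttribute call leads to this exception, here is a small stand-in for the attribute registry idea. It is a simplified model written for illustration and does not use or reproduce Lucene's real classes: MiniAttributeSource, PositionIncrement, and Type are hypothetical names, and the factory argument to addAttribute is a simplification that the real API does not take.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified stand-in for an attribute registry; NOT Lucene's implementation.
class AttributeSourceSketch {

    /** Marker interface for attribute types in this sketch. */
    interface Attribute {}

    static class PositionIncrement implements Attribute { int increment = 1; }
    static class Type implements Attribute { String type = "word"; }

    static class MiniAttributeSource {
        private final Map<Class<? extends Attribute>, Attribute> attributes = new HashMap<>();

        /** Registers the attribute if absent and returns the registered instance. */
        <T extends Attribute> T addAttribute(Class<T> type, Supplier<T> factory) {
            Attribute existing = attributes.get(type);
            if (existing == null) {
                existing = factory.get();
                attributes.put(type, existing);
            }
            return type.cast(existing);
        }

        /** Returns a registered attribute, or fails if the producer never added it. */
        <T extends Attribute> T getAttribute(Class<T> type) {
            Attribute found = attributes.get(type);
            if (found == null) {
                throw new IllegalArgumentException(
                    "This AttributeSource does not have the attribute '" + type.getName() + "'.");
            }
            return type.cast(found);
        }
    }

    public static void main(String[] args) {
        MiniAttributeSource tokenizer = new MiniAttributeSource();
        try {
            // A consumer asks for an attribute the tokenizer never registered.
            tokenizer.getAttribute(PositionIncrement.class);
        } catch (IllegalArgumentException e) {
            System.out.println("Before fix: " + e.getMessage());
        }
        // The fix: the tokenizer registers the attribute up front.
        tokenizer.addAttribute(PositionIncrement.class, PositionIncrement::new);
        System.out.println("After fix: increment = "
            + tokenizer.getAttribute(PositionIncrement.class).increment);
    }
}
```

In this model the consumer plays the role of BasicSummarizer.getTokens from the stack trace: it only reads attributes, so every attribute it reads has to be registered by the tokenizer beforehand.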


5. After fixing problem 4, the page is still blank (this one also cost me three days)

The log files under tomcat show the following exception:

java.lang.IllegalArgumentException: This AttributeSource does not have the attribute 'org.apache.lucene.analysis.tokenattributes.TypeAttribute'.

at org.apache.lucene.util.AttributeSource.getAttribute(AttributeSource.java:277)

at org.apache.nutch.summary.basic.BasicSummarizer.getTokens(BasicSummarizer.java:364)

at org.apache.nutch.summary.basic.BasicSummarizer.getSummary(BasicSummarizer.java:135)

at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:263)

at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:63)

at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:53)

at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)

at java.util.concurrent.FutureTask.run(FutureTask.java:138)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)

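The cause is similar to problem 4. Add the following in the same places in IKTokenizer.java, together with the matching import of org.apache.lucene.analysis.tokenattributes.TypeAttribute: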

private TypeAttribute typeAtt;

typeAtt = addAttribute(TypeAttribute.class);

Then rebuild IKAnalyzer3.2.8.jar again.

Finally, run ant again to produce nutch-1.2.job, nutch-1.2.war, and nutch-1.2.jar. Replace the files in both the crawling part and the search part with the newly built ones, and do not forget IKAnalyzer3.2.8.jar.
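
Because the rebuilt jar has to be copied to more than one place, it can help to confirm that the copies really are identical. The sketch below compares two files byte for byte; the class name JarSyncCheck is hypothetical, and the two paths are passed on the command line because the article does not spell out the exact deployment locations.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: checks that two deployed copies of a jar are the same file content.
class JarSyncCheck {

    /** Returns true if both files contain exactly the same bytes. */
    static boolean sameContent(Path first, Path second) {
        try {
            return Arrays.equals(Files.readAllBytes(first), Files.readAllBytes(second));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java JarSyncCheck.java <jar-used-for-crawling> <jar-used-by-webapp>");
            return;
        }
        boolean same = sameContent(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(same ? "Copies are identical." : "Copies differ: one of them is stale.");
    }
}
```

A stale copy on the search side would reproduce the blank page even though the crawler side has already been fixed.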
Source: http://www.doc100.net/bugs/t/16579/