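Sometimes a worker node cannot connect to the master node.

There are several possible causes. In this case, the problem was that tajo-site.xml had not been defined on the master node.

My mistake was assuming the master node did not need a tajo-site.xml file, so I had set it up only on the worker nodes.

Without that file, the master starts up bound to localhost (127.0.0.1), so other servers cannot reach it.

(It feels similar to django or flask: if you start the server on localhost it cannot be reached from outside, so, if I remember right, you had to start it on 0.0.0.0.)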
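The localhost versus 0.0.0.0 difference can be reproduced with a plain socket. This is only a sketch and not Tajo code: `bound_address` is a hypothetical helper that opens a listener and reports the address it ended up bound to.

```python
import socket


def bound_address(bind_host: str) -> str:
    """Open a TCP listener on bind_host (OS-chosen port) and return the bound IP."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        return server.getsockname()[0]


if __name__ == "__main__":
    # Loopback only: reachable from this machine, not from other hosts.
    print(bound_address("127.0.0.1"))
    # All IPv4 interfaces: reachable from other hosts too, firewall permitting.
    print(bound_address("0.0.0.0"))
```

A server bound to 127.0.0.1 accepts connections only from the same machine, which matches what the workers saw when they tried to reach the master.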


In short, tajo-site.xml must exist on every node.

Workers can connect to the master only when the master node's hostname or IP is defined as shown below.


...


<!-- System Settings -->
<property>
  <name>tajo.rootdir</name>
  <value>hdfs://MASTER_HOST:8020/tajo</value>
  <description>Base directory including system directories.</description>
</property>

<property>
  <name>tajo.master.umbilical-rpc.address</name>
  <value>MASTER_HOST:26001</value>
  <description>TajoMaster binding address between master and workers.</description>
</property>

<property>
  <name>tajo.master.client-rpc.address</name>
  <value>MASTER_HOST:26002</value>
  <description>TajoMaster binding address between master and clients.</description>
</property>

<property>
  <name>tajo.resource-tracker.rpc.address</name>
  <value>MASTER_HOST:26003</value>
  <description>TajoMaster binding address between master and workers.</description>
</property>

<property>
  <name>tajo.catalog.client-rpc.address</name>
  <value>MASTER_HOST:26005</value>
  <description>CatalogServer binding address between catalog server and workers.</description>
</property>


...
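Whether a given tajo-site.xml actually defines these addresses can be checked mechanically. The sketch below is a minimal example under two assumptions: the file uses a Hadoop-style `<configuration>` root element wrapping the `<property>` entries, and `find_address_problems` is a hypothetical helper name, not part of Tajo. The property names are the four address keys from the fragment above.

```python
import xml.etree.ElementTree as ET

# Address properties taken from the config fragment in this post.
MASTER_ADDRESS_KEYS = [
    "tajo.master.umbilical-rpc.address",
    "tajo.master.client-rpc.address",
    "tajo.resource-tracker.rpc.address",
    "tajo.catalog.client-rpc.address",
]
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def find_address_problems(xml_text: str) -> dict:
    """Return {property name: "missing" | "loopback"} for suspicious entries."""
    root = ET.fromstring(xml_text)
    values = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        values[name] = (prop.findtext("value") or "").strip()

    problems = {}
    for key in MASTER_ADDRESS_KEYS:
        if key not in values:
            problems[key] = "missing"
            continue
        host = values[key].rsplit(":", 1)[0]  # strip the ":port" suffix
        if host in LOOPBACK_HOSTS:
            problems[key] = "loopback"
    return problems


# Hypothetical sample: one address is fine, one points at localhost, two are absent.
SAMPLE = """
<configuration>
  <property>
    <name>tajo.master.umbilical-rpc.address</name>
    <value>MASTER_HOST:26001</value>
  </property>
  <property>
    <name>tajo.master.client-rpc.address</name>
    <value>localhost:26002</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for name, problem in find_address_problems(SAMPLE).items():
        print(name, problem)
```

Running the same check against the file on every node, master included, would have caught the mistake described in this post.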



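For reference, the simplest way to check is to open the admin page at http://MASTER_HOST:26080. If TajoMaster is shown as localhost there, the hostname or IP of the master node has not been configured. The field highlighted in red in the screenshot should show the actual hostname or IP, not localhost.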





* Log on the worker server (연결이 거부됨 is the localized "Connection refused" message)


2015-08-11 10:08:02,479 ERROR org.apache.tajo.worker.WorkerHeartbeatService: Max retry count has been exceeded. attempts=3 caused by: java.net.ConnectException: 연결이 거부됨: MASTER_HOST/10.0.0.1:26003
io.netty.channel.ConnectTimeoutException: Max retry count has been exceeded. attempts=3 caused by: java.net.ConnectException: 연결이 거부됨: MASTER_HOST/10.0.0.1:26003
        at org.apache.tajo.rpc.NettyClientBase.doReconnect(NettyClientBase.java:139)
        at org.apache.tajo.rpc.NettyClientBase.connect(NettyClientBase.java:118)
        at org.apache.tajo.rpc.RpcClientManager.getClient(RpcClientManager.java:96)
        at org.apache.tajo.worker.WorkerHeartbeatService$WorkerHeartbeatThread.run(WorkerHeartbeatService.java:187)
2015-08-11 10:08:12,480 WARN org.apache.tajo.rpc.NettyClientBase: 연결이 거부됨: MASTER_HOST/10.0.0.1:26003 Try to reconnect
2015-08-11 10:08:13,482 WARN org.apache.tajo.rpc.NettyClientBase: 연결이 거부됨: MASTER_HOST/10.0.0.1:26003 Try to reconnect
2015-08-11 10:08:14,483 WARN org.apache.tajo.rpc.NettyClientBase: 연결이 거부됨: MASTER_HOST/10.0.0.1:26003 Try to reconnect
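The "connection refused" errors above can be reproduced outside Tajo with a plain TCP probe run from the worker. This is a sketch, not a Tajo tool: `can_connect` and `probe_ports` are hypothetical helper names, and the ports to pass in are the ones from the config in this post (26001, 26002, 26003, 26005).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


def probe_ports(host: str, ports) -> dict:
    """Return {port: reachable} for each port on the given host."""
    return {port: can_connect(host, port) for port in ports}
```

From a worker, something like `probe_ports("MASTER_HOST", [26001, 26002, 26003, 26005])` (with the real master hostname in place of the placeholder) shows which master ports are reachable. If every port is refused from the worker while the same probe against localhost succeeds on the master itself, the master is most likely bound to the loopback address.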


* Log on the master server (without tajo-site.xml, the addresses are logged as localhost / 127.0.0.1 as shown here, and other servers cannot connect to the master)

2015-08-11 10:36:54,182 INFO org.apache.tajo.master.rm.TajoWorkerResourceManager: WorkerResourceAllocationThread start
2015-08-11 10:36:54,184 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.tajo.master.rm.WorkerEventType for class org.apache.tajo.master.rm.TajoWorkerResourceManager$WorkerEventDispatcher
2015-08-11 10:36:54,241 INFO org.apache.tajo.rpc.RpcChannelFactory: Create TajoResourceTrackerProtocol-1 ServerSocketChannelFactory. Worker:3
2015-08-11 10:36:54,311 INFO org.apache.tajo.rpc.NettyServerBase: Rpc (TajoResourceTrackerProtocol) listens on /127.0.0.1:26003
2015-08-11 10:36:54,311 INFO org.apache.tajo.master.rm.TajoResourceTracker: TajoResourceTracker starts up (localhost/127.0.0.1:26003)
2015-08-11 10:36:54,321 INFO org.apache.tajo.catalog.CatalogServer: Catalog Store Class: org.apache.tajo.catalog.store.HCatalogStore
2015-08-11 10:36:54,665 INFO hive.metastore: Trying to connect to metastore with URI thrift://마스터:9083
2015-08-11 10:36:54,685 INFO hive.metastore: Opened a connection to metastore, current connections: 1
2015-08-11 10:36:54,699 INFO hive.metastore: Connected to metastore.
2015-08-11 10:36:54,786 INFO org.apache.tajo.catalog.store.HCatalogStoreClientPool: MetaStoreClient created (size = 1)
2015-08-11 10:36:54,787 INFO hive.metastore: Trying to connect to metastore with URI thrift://마스터:9083
2015-08-11 10:36:54,787 INFO hive.metastore: Opened a connection to metastore, current connections: 2
2015-08-11 10:36:54,788 INFO hive.metastore: Connected to metastore.
2015-08-11 10:36:54,788 INFO org.apache.tajo.catalog.store.HCatalogStoreClientPool: MetaStoreClient created (size = 3)
2015-08-11 10:36:54,822 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.tajo.querymaster.QueryJobEvent$Type for class org.apache.tajo.master.QueryManager$QueryJobManagerEventHandler
2015-08-11 10:36:54,823 INFO org.apache.tajo.master.TajoMaster: Tajo Master is initialized.
2015-08-11 10:36:54,823 INFO org.apache.tajo.master.TajoMaster: TajoMaster is starting up
2015-08-11 10:36:54,828 INFO org.apache.tajo.master.TajoMaster: Default tablespace (default) is already prepared.
2015-08-11 10:36:54,840 INFO org.apache.tajo.master.TajoMaster: Default database (default) is already prepared.
2015-08-11 10:36:54,857 INFO org.apache.tajo.rpc.RpcChannelFactory: Create CatalogProtocol-2 ServerSocketChannelFactory. Worker:48
2015-08-11 10:36:54,867 INFO org.apache.tajo.rpc.NettyServerBase: Rpc (CatalogProtocol) listens on /127.0.0.1:26005
2015-08-11 10:36:54,867 INFO org.apache.tajo.catalog.CatalogServer: Catalog Server startup (127.0.0.1:26005)
2015-08-11 10:36:55,518 INFO org.apache.tajo.rpc.RpcChannelFactory: Create TajoMasterClientProtocol-3 ServerSocketChannelFactory. Worker:24
2015-08-11 10:36:55,523 INFO org.apache.tajo.rpc.NettyServerBase: Rpc (TajoMasterClientProtocol) listens on /127.0.0.1:26002
2015-08-11 10:36:55,523 INFO org.apache.tajo.master.TajoMasterClientService: Instantiated TajoMasterClientService at localhost/127.0.0.1:26002
2015-08-11 10:36:55,528 INFO org.apache.tajo.rpc.RpcChannelFactory: Create QueryCoordinatorProtocol-4 ServerSocketChannelFactory. Worker:48
2015-08-11 10:36:55,536 INFO org.apache.tajo.rpc.NettyServerBase: Rpc (QueryCoordinatorProtocol) listens on /127.0.0.1:26001
2015-08-11 10:36:55,536 INFO org.apache.tajo.master.QueryCoordinatorService: Instantiated TajoMasterService at localhost/127.0.0.1:26001
2015-08-11 10:36:55,685 INFO org.apache.tajo.util.history.HistoryWriter: HistoryWriter_127.0.0.1_26001 started.
2015-08-11 10:36:55,685 INFO org.apache.tajo.util.history.HistoryCleaner: History cleaner started: expiry day=7
2015-08-11 10:40:36,896 ERROR org.apache.tajo.master.TajoMaster: RECEIVED SIGNAL 15: SIGTERM
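The telltale lines in the master log are the "listens on /127.0.0.1:PORT" entries. A small script can pull them out instead of reading the log by eye. This is a sketch: `loopback_listeners` is a hypothetical helper, and the pattern is written against the log lines shown in this post. The "hostname/ip:port" form that it also accepts for a correctly bound master is an assumption, since this post contains no such log.

```python
import re

# Matches lines such as:
#   ... NettyServerBase: Rpc (CatalogProtocol) listens on /127.0.0.1:26005
# An optional hostname before the "/" is tolerated (assumed format).
LISTEN_PATTERN = re.compile(r"Rpc \((\w+)\) listens on \S*?/([\d.]+):(\d+)")


def loopback_listeners(log_text: str) -> list:
    """Return (protocol, port) pairs for RPC servers bound to a 127.x.x.x address."""
    found = []
    for line in log_text.splitlines():
        match = LISTEN_PATTERN.search(line)
        if match and match.group(2).startswith("127."):
            found.append((match.group(1), int(match.group(3))))
    return found
```

If this returns a non-empty list for the master log, the RPC servers are bound to the loopback address and workers on other hosts cannot reach them.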




Another post to refer to when the connection fails:

[TAJO] Connection failure on port 26003 from a worker

