
Cannot load 'zebrafish-optomotor-response' data on EC2 #21

Closed · freemanwyz opened this issue Sep 9, 2014 · 5 comments

@freemanwyz

Hi,
I tried to run the following example on Amazon EC2 using an IPython notebook: http://nbviewer.ipython.org/url/research.janelia.org/zebrafish/notebooks/optomotor-response-PCA.ipynb

from thunder.utils import save, pack, subset
from thunder.regression import RegressionModel
from thunder.factorization import PCA
from thunder.viz import Colorize
import seaborn as sns
# load the data and the model parameters (in this case, a design matrix to perform trial-averaging)
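# (note: tsc is the ThunderContext that thunder creates automatically at startup)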
data, params = tsc.loadExampleEC2('zebrafish-optomotor-response')

But I got the following error when it tried to read the 'zebrafish-optomotor-response' data.

Py4JJavaError                             Traceback (most recent call last)
<ipython-input-18-8420625e6184> in <module>()
      6 # load the data and the model parameters (in this case, a design matrix to perform trial-averaging)
      7 #data, designmatrix = tsc.loadExampleEC2('zebrafish-optomotor-response')
----> 8 data, params = tsc.loadExampleEC2('zebrafish-optomotor-response')

/root/thunder/python/thunder/utils/context.pyc in loadExampleEC2(self, dataset)
    200             data = self.loadText("s3n://" + path + 'data/dat_plane*.txt', filter='dff', minPartitions=1000)
    201             paramfile = self._sc.textFile("s3n://" + path + "params.json")
--> 202             params = json.loads(paramfile.first())
    203             modelfile = asarray(params['trials'])
    204             return data, modelfile

/root/spark/python/pyspark/rdd.pyc in first(self)
    963         2
    964         """
--> 965         return self.take(1)[0]
    966 
    967     def saveAsPickleFile(self, path, batchSize=10):

/root/spark/python/pyspark/rdd.pyc in take(self, num)
    923         """
    924         items = []
--> 925         totalParts = self._jrdd.partitions().size()
    926         partsScanned = 0
    927 

/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
    535         answer = self.gateway_client.send_command(command)
    536         return_value = get_return_value(answer, self.gateway_client,
--> 537                 self.target_id, self.name)
    538 
    539         for temp_arg in temp_args:

/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    298                 raise Py4JJavaError(
    299                     'An error occurred while calling {0}{1}{2}.\n'.
--> 300                     format(target_id, '.', name), value)
    301             else:
    302                 raise Py4JError(

Py4JJavaError: An error occurred while calling o62.partitions.
: org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/optomotor-response%2F1%2Fparams.json' - ResponseCode=403, ResponseMessage=Forbidden
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:122)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.fs.s3native.$Proxy8.retrieveMetadata(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:326)
    at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1337)
    at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1045)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:177)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:176)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:201)
    at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:203)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:201)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:201)
    at org.apache.spark.api.java.JavaRDDLike$class.partitions(JavaRDDLike.scala:50)
    at org.apache.spark.api.java.JavaRDD.partitions(JavaRDD.scala:32)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/optomotor-response%2F1%2Fparams.json' - ResponseCode=403, ResponseMessage=Forbidden
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:477)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535)
    at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987)
    at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
    ... 36 more
Caused by: org.jets3t.service.impl.rest.HttpException
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:475)
    ... 42 more

Regards,
Yizhi Wang

@freeman-lab (Member)

Hi Yizhi, sorry about that. We were testing some changes to the permissions on these files on S3; it should now be working correctly (I just tested access from a temporary account).
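
For anyone hitting the same 403, a quick sanity check is to read the params file directly before calling loadExampleEC2 (a minimal sketch, assuming the SparkContext is available as sc; the bucket name is not shown in the traceback, so <bucket> below is a placeholder):

import json
# attempt the same read that failed above; a 403 here means the S3 permissions are still wrong
paramfile = sc.textFile("s3n://<bucket>/optomotor-response/1/params.json")
params = json.loads(paramfile.first())
print(params.keys())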

@freemanwyz (Author)

Thank you. Now it can read the data and perform PCA, but I got the following error when packing the scores:

# get the scores as images
imgs = pack(pca.scores)
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-7-7fe63671277f> in <module>()
      1 # get the scores as images
----> 2 imgs = pack(pca.scores)
      3 # convert first two PCs to a polar representation
      4 maps = Colorize("polar", scale=1000).image(imgs)

/root/thunder/python/thunder/utils/save.pyc in pack(data, ind, dims, sorting, axis)
    129 
    130     if ind is None:
--> 131         result = data.map(lambda (_, v): float16(v)).collect()
    132         nout = size(result[0])
    133     else:

/root/spark/python/pyspark/rdd.pyc in collect(self)
    647         """
    648         with _JavaStackTrace(self.context) as st:
--> 649           bytesInJava = self._jrdd.collect().iterator()
    650         return list(self._collect_iterator_through_file(bytesInJava))
    651 

/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
    535         answer = self.gateway_client.send_command(command)
    536         return_value = get_return_value(answer, self.gateway_client,
--> 537                 self.target_id, self.name)
    538 
    539         for temp_arg in temp_args:

/root/spark/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    298                 raise Py4JJavaError(
    299                     'An error occurred while calling {0}{1}{2}.\n'.
--> 300                     format(target_id, '.', name), value)
    301             else:
    302                 raise Py4JError(

Py4JJavaError: An error occurred while calling o111.collect.
: org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:637)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:636)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:636)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1234)
    at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
    at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
    at akka.actor.ActorCell.terminate(ActorCell.scala:338)
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
    at akka.dispatch.Mailbox.run(Mailbox.scala:218)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

@freeman-lab (Member)

@freemanwyz I just pushed a new release that should fix this. It appeared to be an error related to the memory configuration in the custom AMI we were trying. If you install the new version of thunder (best to call pip uninstall thunder-python and then pip install thunder-python) and launch a new EC2 cluster, it should work; I just ran that demo analysis all the way through on a cluster with 19 nodes.
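
Spelled out, those reinstall steps (run on the machine you launch clusters from; this assumes pip installs into the same environment that thunder-ec2 uses):

# remove the old release, then install the new one
pip uninstall thunder-python
pip install thunder-python
# then launch a fresh cluster with thunder-ec2 and rerun the notebook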

@freemanwyz (Author)

I followed your steps to reinstall thunder and created a new cluster, but when I tried to run thunder, it said:

[root@ip-172-31-9-124 ~]$ thunder
sh: /root/spark/bin/pyspark: No such file or directory

I searched the screen output from when thunder-ec2 set up the cluster; it contains the following error:

~/spark-ec2
Initializing spark
~ ~/spark-ec2
spark/init.sh: line 92: syntax error near unexpected token `)'
spark/init.sh: line 92: `    1.1.0)'
Initializing shark
~ ~/spark-ec2 ~/spark-ec2
ERROR: Unknown Shark version
~/spark-ec2 ~/spark-ec2

Then I opened init.sh in /root/spark-ec2/spark/, and it looks like line 91 is wrong, with a redundant *).
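
For context, a stray catch-all left before the next version label produces exactly that message: bash parses 1.1.0) as a command name followed by an unexpected ). An illustrative sketch only, not the actual contents of init.sh:

case "$SPARK_VERSION" in
  1.0.2)
    echo "setting up Spark 1.0.2"
    ;;
  *)      # redundant catch-all with no ;; before the next pattern...
  1.1.0)  # ...makes bash report: syntax error near unexpected token `)'
    echo "setting up Spark 1.1.0"
    ;;
esac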

@freeman-lab (Member)

@freemanwyz This was a typo in the deployment scripts that we call during setup, and it appears to have just been fixed (see mesos/spark-ec2@2b35fc1), if you want to try again.
