I am the only user of Impala on our performance cluster. After running a few queries, I run into the following error:
[Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : Memory limit exceeded. The memory limit is set too low to initialize the spilling operator. The minimum required memory to spill this operator is 272.00 MB.
The following link says that increasing memory across all nodes to the same high level will solve the problem:
http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/impala-memory-limit-exceed/m-p/50220
Following that advice, we increased the memory allocated to Impala to 2 GB across all nodes.
Impala out of memory in performance cluster
-
- Site Admin
- Posts: 185
- Joined: Wed Jul 16, 2014 9:22 pm
- Contact:
Re: Impala out of memory in performance cluster
Just increase the mem_limit parameter if you have enough memory. You can also estimate the required memory with the formula below.
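As a minimal sketch of raising the limit for a single session: this assumes impala-shell is installed, an impalad is reachable on localhost, and a hypothetical table named web_logs (both names are placeholders, and a live cluster is required to run it):

```shell
# Raise the per-query memory limit for this session only, then re-run the query.
# "localhost" and "web_logs" are placeholders for your coordinator and table.
impala-shell -i localhost -q "SET MEM_LIMIT=2g; SELECT COUNT(*) FROM web_logs;"
```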
Cluster total memory requirement = size of the smaller table × selectivity factor of the predicate × projection factor × compression ratio
= 100 TB × 10% × 5/200 × 3
= 0.75 TB
= 750 GB
selectivity factor of the predicate = 10% (the fraction of rows the query selects); in our case it can vary from 1% to 70%
projection factor = 5/200 (5 columns read out of 200 total); it could be as many as 100 columns out of 200
compression ratio = 3 (the Snappy compression ratio); this is correct in our case
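The arithmetic above can be checked with a quick one-liner. The numbers are copied from the example; the formula itself is a rough sizing heuristic, not an Impala API:

```shell
# Estimate: smaller-table size * selectivity * projection factor * compression ratio
awk 'BEGIN {
  size_tb     = 100        # size of the smaller table, in TB
  selectivity = 0.10       # 10% of rows pass the predicate
  projection  = 5 / 200    # 5 of 200 columns are read
  ratio       = 3          # Snappy compression ratio
  req_tb = size_tb * selectivity * projection * ratio
  printf "%.2f TB = %.0f GB\n", req_tb, req_tb * 1000
}'
# → 0.75 TB = 750 GB
```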
Re: Impala out of memory in performance cluster
Memory - 128 GB or more recommended, ideally 256 GB or more. If the intermediate results during query processing on a particular node exceed the amount of memory available to Impala on that node, the query is cancelled. Note that because the work is parallelized, and intermediate results for aggregate queries are typically smaller than the original data, Impala can query and join tables that are much larger than the memory available on an individual node.
Storage - DataNodes with 12 or more disks each. I/O speeds are often the limiting factor for disk performance with Impala. Ensure that you have sufficient disk space to store the data Impala will be querying.
Re: Impala out of memory in performance cluster
Most of the time, "Memory Limit Exceeded" errors are due to table/column stats not being collected, which leads the planner to choose the wrong join distribution. Once stats are gathered with COMPUTE STATS, you may find queries take significantly less memory to execute.
Many workloads may run perfectly fine with significantly less than the recommended memory once stats are collected.
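A sketch of that workflow, again assuming impala-shell, an impalad on localhost, and a hypothetical table web_logs (a live cluster is needed to run it):

```shell
# Gather table and column statistics so the planner can choose the right
# join distribution, then confirm the stats were recorded.
impala-shell -i localhost -q "COMPUTE STATS web_logs;"
impala-shell -i localhost -q "SHOW TABLE STATS web_logs;"
```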