Some days ago I quickly went through a post by Sergio Veiga, stating an interesting difference in speed between two Erlang functions which basically have the same functionality of retrieving a value from a list: lists:keyfind/3 and proplists:get_value/2.
Intrigued, I decided to perform additional testing, by performing a small benchmark of a random key retrieval on lists of different sizes, using those two different functions.
Here’s the code I used:
-module(pvsl).
-define(LIST_SIZES, [10000, 100000, 1000000]).
-define(RETRIES, 1000).
-compile(export_all).
start() ->
% test for different list sizes
lists:foreach(fun(N) -> test_list(N) end, ?LIST_SIZES).
test_list(ListSize) ->
% generate a list of size ListSize of {Key, Val} entries
KeyList = [{K, K} || K <- lists:seq(1, ListSize)],
% test this list against both functions
lists:foreach(fun(Type) -> get_val(Type, now(), KeyList, ListSize, ?RETRIES) end,
[proplists, lists]).
% test getting values, compute necessary time and output print results
get_val(Type, Start, _KeyList, ListSize, 0) ->
T = timer:now_diff(now(), Start),
io:format("computed ~p random key searches on a ~p-sized list in ~p ms using ~p~n",
[?RETRIES, ListSize, T/1000, Type]);
get_val(proplists, Start, KeyList, ListSize, Tries) ->
proplists:get_value(random:uniform(ListSize), KeyList),
get_val(proplists, Start, KeyList, ListSize, Tries - 1);
get_val(lists, Start, KeyList, ListSize, Tries) ->
lists:keyfind(random:uniform(ListSize), 1, KeyList),
get_val(lists, Start, KeyList, ListSize, Tries - 1).
I ran this test on my MacBook Pro, Intel Core i5 2.4GHz with 4GB Memory, and Erlang R13B04, with Kernel Polling enabled. These are the results.
roberto$ erl +K true +P 1000000
Erlang R13B04 (erts-5.7.5) [source] [smp:4:4] [rq:4] [async-threads:0] [hipe] [kernel-poll:true]
Eshell V5.7.5 (abort with ^G)
1> c(pvsl).
{ok,pvsl}
2> pvsl:start().
computed 1000 random key searches on a 10000-sized list in 323.373 ms using proplists
computed 1000 random key searches on a 10000-sized list in 12.897 ms using lists
computed 1000 random key searches on a 100000-sized list in 3273.973 ms using proplists
computed 1000 random key searches on a 100000-sized list in 130.592 ms using lists
computed 1000 random key searches on a 1000000-sized list in 34131.905 ms using proplists
computed 1000 random key searches on a 1000000-sized list in 2050.627 ms using lists
ok
3>
Of course, this is far from being a complete benchmark, however it does show that proplists:get_value/2 is considerably slower [up to 25 times on smaller lists on this benchmark run] than lists:keyfind.
Not sure why this is happening, though I would recommend using lists:keyfind/3 on all speed-critical parts of your code.
thank you adam for this feedback.
yes, that is indeed true. what may be an additional reason, is that proplists:get_value performs additional tests on list size and key being an atom, all things which do slow down code.
Hi Roberto, keyfind and keysearch are BIFs, while proplists:get_value is not. We recently switched all internal KV lookups in CouchDB to use use keysearch (for R12 compatibility).