They aren't easily comparable because they work in very different ways - also note that Kafka number is for a cluster running on three machines: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
https://redis.io/topics/benchmarks should be useful here though. They get 770,000 LPUSH/LPOP per second on a single node Intel(R) Xeon(R) CPU E5520 @ 2.27GHz using Redis pipelining.
There is no client that works in multiple languages...
Use this, pick one with a gold star.
While the clients may differ for each language, the Redis commands are the same for all of them; each client just exposes slightly different ways to call the same commands (SET, GET, DEL, EXPIRE, etc.)
P.S. I work for Redis 😁
Depending on the settings, Redis can periodically dump the whole in-memory dataset in the RDB format, or it can log every command modifying the dataset into an append-only file (AOF), which is then replayed at startup to reconstruct the state. You can also use both RDB and AOF persistence at the same time.
If you need more info, check the persistence docs.
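As a rough sketch, the relevant redis.conf directives look like this (the save points shown are the shipped defaults; tune them to your needs):

```conf
# RDB: snapshot the dataset to dump.rdb if at least 1 key changed in 900s,
# 10 keys in 300s, or 10000 keys in 60s (these are the shipped defaults)
save 900 1
save 300 10
save 60 10000

# AOF: additionally log every write command to an append-only file,
# fsync'ing it once per second
appendonly yes
appendfsync everysec
```

With both enabled, Redis uses the AOF to reconstruct state on restart, since it's guaranteed to be the most complete.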
The documentation is pretty explicit:
For the del command: Integer reply: the number of keys that were removed.
Source: https://redis.io/commands/del
For the set command: Simple string reply: OK if SET was executed correctly. Null reply: a Null Bulk Reply is returned if the SET operation was not performed because the user specified the NX or XX option but the condition was not met.
Source: https://redis.io/commands/set
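As a quick illustration in redis-cli (assuming mykey and otherkey don't exist beforehand; the inline XX option needs Redis >= 2.6.12):

```
> SET mykey "hello"
OK
> DEL mykey otherkey
(integer) 1
> SET mykey "world" XX
(nil)
```

DEL returns 1 because only mykey existed, and the final SET returns a null reply because XX requires the key to already exist.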
Hello and please forgive my educational tone.
TL;DR There are no free lunches, so all work is "bad" for the CPU, but on the other hand an idle CPU is just a waste of resources. That said, the CPU will probably be ok.
First, you should read more about Redis' expiration - the truth is out there.
If you've read the above carefully, you'll see that expiry's CPU usage is managed - keys expire passively on access, or actively 10 times every second (the hz configuration directive). This ensures that even if the entire keyspace expires at the same microsecond (a gnab gib of sorts), the server will still be responsive.
> they all will be ticking every second right?
I hope that by now you understand that there are no ticking keys - that would be an extremely inefficient way to manage expiration.
Note: actually, the real price (CPU-wise) of expiration is the deletion of the value. Bigger values (think a List w/ 10K elements) require more work to free (i.e., deallocate). A major improvement in v4 is "lazy deletion" (see the UNLINK command for details), which can be used during expiration by setting the lazyfree-lazy-expire configuration directive to "yes".
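For reference, turning that on is a one-line change in redis.conf (or CONFIG SET at runtime, Redis >= 4.0):

```conf
# Reclaim memory from expired keys asynchronously, in a background thread
lazyfree-lazy-expire yes
```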
Ubuntu uses systemd for start up scripts.
https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-redis-on-ubuntu-16-04
I recommend following that guide for your setup (: it might help a little more. Hope it works!
Run three copies of redis on different ports, using three configs, three rdb files, etc. Redis is inherently a single-port service using a single-threaded event loop. You're probably thinking about redis cluster, which can shard data across multiple redis instances, but you'd still be running multiple copies of redis. Follow these instructions to try that: https://redis.io/topics/cluster-tutorial
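As a minimal sketch of that multi-instance setup (the ports and paths here are arbitrary examples, not anything Redis requires):

```shell
# Sketch: generate three standalone configs. Each instance gets its own
# port, RDB filename, and working directory entry so they don't collide.
mkdir -p /tmp/redis-multi
for port in 7001 7002 7003; do
cat > /tmp/redis-multi/redis-${port}.conf <<EOF
port ${port}
dbfilename dump-${port}.rdb
dir /tmp/redis-multi
EOF
done
# Then start each one (requires redis-server on the PATH):
#   redis-server /tmp/redis-multi/redis-7001.conf --daemonize yes
```

Each copy is completely independent; redis cluster just adds coordination between such instances.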
I didn't see the existing TLS PR, and I'm not finding it now. Do you have a link?
As for why fork and not PR, Salvatore already closed the Transactions PR and said he didn't want Redis to go in that direction. And when searching about SSL/TLS in Redis itself, I found this: https://redis.io/topics/encryption , read the implementation of spiped (it uses fixed 1k block sizes), then realized that SSL/TLS is the right answer in this situation.
Could transactions be a module? I was about halfway through the cluster transaction bits as a module when I hit a collection of "oh wait, I can't even call this entire class of things unless I create new module wrappers for both directions" problems. Then I just added a new .c file, new .h, did the right includes, a make clean && make, and my life was 10x better.
Edit: Also, this just includes redis-benchmark; redis with SSL/TLS, etc., is still a couple weeks out. I need to get redis-cli and redis-sentinel speaking SSL/TLS.
The block of commands between MULTI and EXEC are executed by the Redis server in one atomic action. Redis does not execute commands from other clients while the MULTI / EXEC block is executing. You can put a read command and a write command in the block, but unfortunately you can't make a decision in between the commands. Your client code receives the results of the commands in the block after the whole block has been executed.
You can achieve what you're looking for in a Lua script. The script constitutes a single "command" from the Redis server's perspective, so no other clients can alter the data while the Lua script is executing. Since it's a script, it can read keys, make decisions, and write keys as it executes.
https://redis.io/commands/eval is the starting point for investigating the use of Lua scripts with Redis server. There are also some good tutorials published on the Web.
One thing to be aware of is that a complex Lua script will make the other clients wait longer to have Redis process their commands. I.e., a Lua script that does too much will make Redis slower. Keep your Lua scripts as small and fast as you can.
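To illustrate, here's a hypothetical script (the key name and cap are made up) that reads a value, makes a decision, and conditionally writes - exactly the read-decide-write pattern a plain MULTI/EXEC block can't express:

```lua
-- Runs atomically on the server via EVAL.
-- KEYS[1] is a counter key, ARGV[1] is a maximum allowed value.
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
if current < tonumber(ARGV[1]) then
  -- still under the cap: increment and return the new value
  return redis.call('INCR', KEYS[1])
end
-- cap reached: write nothing and return nil to the client
return nil
```

You'd invoke it with something like EVAL "<script>" 1 mycounter 100 from redis-cli or any client.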
First learn redis, then learn to use the hiredis client. Look for books or tutorials that introduce redis. Or just read the docs if you're brave: https://redis.io/topics/data-types-intro
There's a redis-cli you can use to interactively query redis: https://redis.io/topics/rediscli
Don't use KEYS *, especially not on a production instance or one with lots of keys. Redis is single-threaded, and one long-running command (like KEYS *) can bring a lot of other connections to a grinding halt.
Instead try to use something like redis-cli --scan ... | sort | gzip > keys.gz to produce a sorted, compressed list of all keys on the server. Most likely you'll see a pattern emerging for all the keys, and then you can use the TYPE key:name command (inside redis-cli) to find out what some of the keys represent (sets, lists, zsets, etc.)
If the developers followed Redis recommendations and used something like ":" as the separator for key-name components (think "/" for file paths), then you can use the following to get a general idea about the distribution of top-level key components (which is useful if you want to find out where most of the data is stored): zcat keys.gz | awk -F\: '{print $1}' | uniq -c
Happy hunting and, please, don't use KEYS * (see the Warning on this page: https://redis.io/commands/keys)
Just use incr on a key each page load... If you wanted to track hits by day just name the key the date or something and then have a process to collect the stats later. You can even expire the stats automatically by setting an expiration on the key.
I have a lot of process performance counters that I do this with, to track stuff like the number of users connected to a stream or the number of lazy write processes running/waiting (I cache DB writes so my clients do not have to wait for persistence, and then have a script that is continuously checking for new inserts to run against the database). HyperLogLog is another option, and you could even use lists if you wanted to, but INCR is the way to go when tracking exactly one metric.
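Here's a sketch of the per-day counter idea in Python, using a plain dict to stand in for Redis so it runs anywhere - the hits:<page>:<date> key naming is just my own convention, and in real code record_hit would be an INCR (plus an EXPIRE) against a Redis client:

```python
from datetime import date

# Stand-in for Redis: a plain dict mapping key -> integer counter.
# In real code these would be INCR/EXPIRE calls against a Redis client.
store = {}

def record_hit(page, day=None):
    """Emulates INCR hits:<page>:<YYYY-MM-DD> (key scheme is my own)."""
    day = day or date.today().isoformat()
    key = f"hits:{page}:{day}"
    store[key] = store.get(key, 0) + 1
    return store[key]

# Three loads of /home on one day, one load on the next day.
record_hit("/home", "2019-07-01")
record_hit("/home", "2019-07-01")
record_hit("/home", "2019-07-01")
record_hit("/home", "2019-07-02")

print(store["hits:/home:2019-07-01"])  # 3
```

Because each day gets its own key, a later collection job can read (or expire) whole days at a time.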
Turns out what I said is not correct.
https://redis.io/topics/cluster-spec
The reason for requiring a majority of masters appears to be to do with netsplits. When a master goes down, the remaining masters have to decide what replica to promote, however if there is a netsplit then it's possible that actually all the masters are functioning and connected to clients but unable to talk to each other. In that case, the minority side would be expected to stop accepting writes until they can reconnect to the other side of the split.
Replicas participating in majority consensus could result in two masters operating on the same key partition, leading to merge conflicts when they reconnect. Instead, if a majority of nodes are down, the cluster assumes it is on the small side of a split and shuts down to maintain data integrity for when they reattach to the potentially still running cluster.
I don't know what is NCache, so can't say if it is better or not.
For Redis you can store data on disk with a small delay, more is here https://redis.io/topics/persistence. So you can use it in "traditional database" fashion. Still getting advantages of Redis.
I've heard about cases where it was used as a primary data source, but I've never tried that myself. There are many ways to store such session data; you can probably do an architecture review to consider better options, since there are a lot of ideas.
Agree, the documentation could be more friendly.
Publish, Subscribe and PSubscribe are general commands. You can create your own channels, publish messages and subscribe to channels. The documentation for psubscribe addresses the general case where you can define any pattern you like.
Keyspace notifications are system generated notifications, and hence the naming convention is designed to not have conflicts with user defined channels. The specific patterns are described in this article - https://redis.io/topics/notifications.
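Note that keyspace notifications are disabled by default; you have to opt in first, e.g.:

```conf
# Publish events on __keyevent@<db>__:expired channels
# (E = key-event channel, x = expired events)
notify-keyspace-events "Ex"
```

After that, PSUBSCRIBE '__keyevent@*__:expired' in redis-cli will show keys as they expire.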
16 is the default number of separate databases redis uses: https://github.com/antirez/redis/blob/unstable/redis.conf#L183
For more details check the SELECT command: https://redis.io/commands/select
I only use DB 0 for my applications. In some cases I use a DB other than 0 for unit-testing.
Hi,
There is actually a lot of good documentation on redis.io. Here is the quickstart guide that will walk you through installing Redis and using redis-cli:
https://redis.io/topics/quickstart
Also, there are packages for Redis on pretty much all Linux distros and it's available via brew on OSX.
Once you've familiarized yourself with using Redis in redis-cli you can start playing with hiredis. The hiredis API is quite good and should be easy to learn if you know C.
It takes a huge amount of load to use up any significant amount of cpu on a redis instance.
The high latency is likely from the inherent slowness of the underlying virtual machine.
I suggest running some of the diagnostics listed in the docs on this page: https://redis.io/topics/latency to test for the intrinsic latency of the setup you have. If that is the case, you're somewhat out of luck. Newer vms with better virtualization at the hardware level often have a lot better latency as do bare metal machines, so upgrading your hosting might be the only thing you can do.
I have found a reference on the incompatibility: "Note however that Redis Cluster 4.0 is not compatible with Redis Cluster 3.2 at cluster bus protocol level, so a mass restart is needed in this case."
Looking over the documentation of phpredis it looks like that version of set uses setex under the hood. https://redis.io/commands/setex
SETEX takes its TTL in seconds, so you're actually setting that key to expire in 2500 seconds rather than 2500 milliseconds.
Going by the docs here, to use the higher-precision milliseconds API you'll want something in this form instead:
// Will set the key, only if it already exists, with a TTL of 1000 milliseconds
$redis->set('key', 'value', Array('xx', 'px' => 1000));
A late response, hoping it is still useful.
If you are using sentinels and your master/slave set is not fronted by a proxy that will auto failover your connections, you should use a sentinel-aware client (i.e. Redis driver) to connect. Check your client documentation to find out about it. For e.g. node_redis doesn't support sentinels yet but ioredis does. List of all Node.js Redis clients: https://redis.io/clients#nodejs
If you have already gone the polling the sentinel route as discussed in the other response, ensure that you read this doc (https://redis.io/topics/sentinel-clients). It has guidelines for Redis client developers on how to interact with sentinels and would be useful since you will be attempting something similar.
I understand using the latest version. In fact I help enable users to do that by building RPMs for redis 2.8 and 3.0 for the IUS project, whose goal is to provide newer versions of select software for RHEL and CentOS. The reason I ask is that our RPMs stick with the major release branches. For example, redis28u will always be the latest version of 2.8.x, and redis30u will always be the latest version of 3.0.x. Our policy is once a major version branch is considered "end of life" upstream, we retire the corresponding package. Regardless of the likelihood of security issues, we only care about what the upstream project considers "supported". If the 2.8 branch will still be considered for hypothetical critical fixes in the future, can that be documented somewhere with an end of life date?
Here is an example of what I'm talking about.
I think you are running redis outside of a docker container. See the instructions here
https://redis.io/topics/quickstart
"Installing Redis more properly"
It looks like you did the
sudo mkdir /etc/redis
sudo mkdir /var/redis
But are not running redis as root. You created the /etc/redis folder as root, but when running redis not as root (which is a good thing) it can't read files in that directory. You need to open up permission to this redis folder so that the user running redis (likely yourself) can read and write files in both of these directories.
Maybe you did, but that's also the default behavior of the PING
command, so please forgive my skepticism: https://redis.io/commands/ping
Because of that, the screenshot provided is indistinguishable from a responsive redis process.
Maybe share the contents of your redis.conf
file. If something is wrong, it might show up there.
You can use SORT for this type of thing - the movie data being hashes and the playlists being lists. Since you don't care about the actual sorting, just sort BY a non-existent hash field, which skips the sorting step. Beware: you can't easily scale this past a single node.
> HSET movie1 foo bar biz buzz
(integer) 2
> HSET movie2 foo 1 biz 2
(integer) 2
> LPUSH movies movie1 movie2
(integer) 2
> SORT movies BY *->noop GET # GET *->foo GET *->biz
1) "movie1"
2) "bar"
3) "buzz"
4) "movie2"
5) "1"
6) "2"
Well, that's what the official Redis FAQ says. https://redis.io/topics/faq
"It's not very frequent that CPU becomes your bottleneck with Redis, as usually Redis is either memory or network bound."
It's not going to be entirely straightforward as lists and sorted sets in Redis have add and remove operations that are O(log n) or O(n).
O(1) access by value is simple - just use GET and SET (or HGET and HSET).
> I need to implement something like classical LRU, but without eviction policy and moving element in the front of the list on access.
It might help to understand a little better what your use for the list is. This doesn't really sound like LRU when you're not tracking recently used keys, and not removing anything at all - or removing only by a fixed expiration time rather than by usage.
You can build a linked list with Redis's hashes. A node in your list would be created like HSET node.id value node next nextNode.id prev prevNode.id. The 'id' of your nodes becomes your pointer, essentially. And you'd store the head (and, if you need it, the tail) of your list separately, like SET head node.id. You'd probably want to prefix your keys rather than just using "head" or node.id - like "mywidgetlist.head" and 'mywidgetlist:' + node.id.
You can achieve O(1) add & remove with that, and retrieval by key is a simple O(1) HGET node.id value.
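To make the pointer scheme concrete, here's a sketch in Python that simulates the Redis hashes with plain dicts (so it runs without a server); the key prefix and field names follow the example above, and the HSET/HGET/SET calls they stand in for are noted in comments:

```python
# Stand-in for Redis: each "hash" is a dict keyed by a prefixed node id,
# and "strings" holds the head pointer (SET/GET stand-in).
hashes = {}
strings = {}

PREFIX = "mywidgetlist"

def push_front(node_id, value):
    """O(1) insert at the head: one HSET for the node, one SET for head."""
    old_head = strings.get(f"{PREFIX}.head")
    hashes[f"{PREFIX}:{node_id}"] = {"value": value, "next": old_head, "prev": None}
    if old_head is not None:
        hashes[f"{PREFIX}:{old_head}"]["prev"] = node_id  # HSET old head's prev
    strings[f"{PREFIX}.head"] = node_id                   # SET mywidgetlist.head

def get_value(node_id):
    """O(1) retrieval, like HGET mywidgetlist:<id> value."""
    return hashes[f"{PREFIX}:{node_id}"]["value"]

def remove(node_id):
    """O(1) unlink: patch the neighbours' next/prev, then delete the hash."""
    node = hashes.pop(f"{PREFIX}:{node_id}")
    if node["prev"] is not None:
        hashes[f"{PREFIX}:{node['prev']}"]["next"] = node["next"]
    else:
        strings[f"{PREFIX}.head"] = node["next"]
    if node["next"] is not None:
        hashes[f"{PREFIX}:{node['next']}"]["prev"] = node["prev"]

push_front("a", 1)
push_front("b", 2)
push_front("c", 3)   # list is now c -> b -> a
remove("b")          # list is now c -> a
print(get_value(strings[f"{PREFIX}.head"]))  # 3
```

Every operation touches a fixed number of keys, which is what gives you the O(1) behaviour regardless of list length.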
All of the data structure commands can be seen here: https://redis.io/commands
>INCR
I'm trying to understand exactly what you mean.
We are talking about this right? https://redis.io/commands/INCR
From what I see it looks like it can't be used in keys but only in values - do you have an example of a working query?
Thanks for the help! :)
This sounds like you want a list. You don't have to know how many elements are in the list, RPUSH adds the provided value as a new item at the end of the list. To illustrate:
(the key named 'thislist' has no elements yet)
RPUSH thislist foo
('thislist' now has 'foo' in position 0)
RPUSH thislist bar
('thislist' now has 'bar' added, in position 1)
RPUSH thislist baz
('thislist' now has 'baz', in position 2)
You can see that the position (index) for the new value automatically increments when you use RPUSH. If you think of the index as a kind of extension of the key name, it's very similar to the autoincrementing key name that you asked for.
You can read elements in the list by their index with LINDEX, you can remove elements from the list in the order they were written with LPOP, or reverse order with RPOP. You can change the value at a particular position with LSET, and there are other commands available to add/change/remove values in a list. The redis.io commands page describes them.
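The same sequence in redis-cli form (RPUSH replies with the list's new length):

```
> RPUSH thislist foo
(integer) 1
> RPUSH thislist bar
(integer) 2
> RPUSH thislist baz
(integer) 3
> LINDEX thislist 1
"bar"
```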
Firstly, Redis 3 is pretty old! If you haven't got a strong reason to stick with it, I'd recommend upgrading to Redis 5, or at least 4 if you want the most production-hardened version (https://redis.io/download)
Can you share more info about your config? - In particular, what's non-standard about it?
Testing with just
$ redis-server -
unixsocket /var/run/redis/redis.sock
port 0
works with no issue for me (Redis 5.0.5)
The (error) ERR unknown command '0' is particularly suspect, as it shouldn't be interpreting '0' as a command...
Have you read the comments around the bind and port keywords in the redis.conf file?
Here are those sections in the example redis.conf file that's distributed with the source code available from https://redis.io/download:
################################## NETWORK #####################################
# By default, if no "bind" configuration directive is specified, Redis listens
# for connections from all the network interfaces available on the server.
# It is possible to listen to just one or multiple selected interfaces using
# the "bind" configuration directive, followed by one or more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
# bind 127.0.0.1 ::1
#
# ~~~ WARNING ~~~ If the computer running Redis is directly exposed to the
# internet, binding to all the interfaces is dangerous and will expose the
# instance to everybody on the internet. So by default we uncomment the
# following bind directive, that will force Redis to listen only into
# the IPv4 loopback interface address (this means Redis will be able to
# accept connections only from clients running into the same computer it
# is running).
#
# IF YOU ARE SURE YOU WANT YOUR INSTANCE TO LISTEN TO ALL THE INTERFACES
# JUST COMMENT THE FOLLOWING LINE.
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bind 127.0.0.1
# Accept connections on the specified port, default is 6379 (IANA #815344).
# If port 0 is specified Redis will not listen on a TCP socket.
port 6379
The syntax for the "bind" parameter is only the IP address. The port number is declared on another line with the "port" parameter.
you can specify expiration time with the default write operation on top-level items.
items in a hash or sorted set do not support this, though.
another option off the top of my head is to store the time along with the cached data, then have your caching logic compare the current time against the stored time on fetch, and hit the underlying database if the difference between the two times exceeds your desired threshold.
is your goal to flush the cache every X hours, or is it just that you want to force a refetch from the database every so often to ensure the data is fresh?
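if you go the store-the-time route, the logic is roughly this - sketched in Python with a plain dict standing in for Redis (in practice the timestamp and value could be two fields of a hash; the 4-hour threshold is just an example):

```python
import time

# Stand-in for Redis: cache maps key -> (stored_at, value). In real Redis
# these could be two fields of a hash written together.
cache = {}
MAX_AGE_SECONDS = 4 * 3600  # refetch anything older than 4 hours (example)

def fetch(key, load_from_db, now=None):
    """Return the cached value unless it's older than MAX_AGE_SECONDS."""
    now = now if now is not None else time.time()
    entry = cache.get(key)
    if entry is not None and now - entry[0] <= MAX_AGE_SECONDS:
        return entry[1]                # fresh enough, skip the database
    value = load_from_db(key)          # stale or missing: hit the database
    cache[key] = (now, value)          # store value with its fetch time
    return value

calls = []
def load(key):
    calls.append(key)
    return f"row-for-{key}"

fetch("user:1", load, now=0)            # miss -> loads from "db"
fetch("user:1", load, now=3600)         # 1h old -> still fresh, no db call
fetch("user:1", load, now=5 * 3600)     # 5h old -> stale, reloads
print(len(calls))  # 2
```

the upside over per-key TTLs is that this works even for items inside a hash or sorted set, since the freshness check is in your own code.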
Open source. Enterprise only takes management of the cluster off your hands. If you know as much as you already do it isn't much more till you can manage it yourself. Clustered redis is fairly self maintaining. https://redis.io/topics/cluster-tutorial It may be a long read, but understanding it will help you understand the limitations of databases.
From the Redis Cluster tutorial at https://redis.io/topics/cluster-tutorial, the first section below the introduction is: > Redis Cluster 101 > Redis Cluster provides a way to run a Redis installation where data is automatically sharded across multiple Redis nodes.
The fundamental reason to use the sharded Redis Cluster is sharding, which is the technique of distributing the keys in the database among multiple servers.
As the sentence above implies, there's another way to run Redis which does not distribute keys among multiple servers. This non-sharded configuration does not have a special name, it's just plain Redis. Since it's not sharded, all the keys are together in a single node.
However these two configurations are not compatible with each other. I.e., you can't set up the sharded Redis Cluster as a single node and later expand it by adding nodes. You have to start with at least three nodes, as the tutorial says farther down in the page:
> Note that the minimal cluster that works as expected requires to contain at least three master nodes.
So the answer to your question is, "No, you can't set up a single node Redis Cluster." You can set up a single node, but it will not be the sharded Redis Cluster, it will be the non-sharded Redis.
For the connected client's IP address, have a look at the INFO command, specifically "INFO clients": https://redis.io/commands/info There's also CLIENT LIST. And the operating system's facilities for seeing the details of connections to port 6379.
Is there a way on the server to know if a connected client is redis-cli instead of other client software? Nope.
I'm aware of the tests that come with the Redis server source code tar file: https://redis.io/download
I don't know about others, but I presume the authors of Redis clients have some collection of tests they use. The ones that are packaged as open source may have something useful.
Your redundancy comes from configuring replication in Redis. Whether you deploy the replicated instances via Swarm or some other mechanism doesn't matter - without replication configured in Redis itself, you don't have any redundancy.
If you configure Redis to be clustered, this means that each node contains a subset of the data, and the client must connect to the node which contains the data it requires.
Many clients, such as StackExchange, do this automatically for you.
It does mean your client needs access to all of the nodes.
You can read more about clusters here. There's a section with some specific details for running a cluster in docker.
> Was this sharded?
Yes. You have a cluster of 3 nodes, so your data is sharded across the three nodes. Redis uses hash slots to determine which shard a specific key is stored on. You can read the details on how clustering works here
The second section of the Redis Cluster Spec Document explains the server's "MOVED" reply:
> Since cluster nodes are not able to proxy requests, clients may be redirected to other nodes using redirection errors -MOVED and -ASK. The client is in theory free to send requests to all the nodes in the cluster, getting redirected if needed, so the client is not required to hold the state of the cluster. However clients that are able to cache the map between keys and nodes can improve the performance in a sensible way.
Redis Cluster shards your data set. In other words, it divides your data set up among your three master servers so each master has about 1/3rd of the keys. The point here being that the master does not have the other 2/3rds of the keys.
As the document explains, the Redis servers (masters) do not proxy your request over to the server that has the key you're asking for. Instead, it returns a redirection reply, "MOVED" or "ASK", which informs your client it should talk to one of the other servers to read/write that key. The "MOVED" reply usually suggests the correct server to talk to.
Apparently the client you're using does not support Redis's sharding Cluster mode, because instead of connecting to the suggested server, it throws an exception back to your code. Or perhaps the client supports Cluster mode, but you haven't told it to use that mode.
consider using the MONITOR command (though not in a high-traffic production environment!)...
https://redis.io/commands/monitor
have found quite some interesting calls with that in the past.
From the documentation at redis.io, the expire command: https://redis.io/commands/expire
3/4ths of the way down the page is the section titled "Appendix: Redis expires" with a description of how Redis keeps expiration information as the timestamp a key becomes invalid (expires) and the effect of dramatic time shifts:
> Keys expiring information is stored as absolute Unix timestamps (in milliseconds in case of Redis version 2.6 or greater). This means that the time is flowing even when the Redis instance is not active.
> For expires to work well, the computer time must be taken stable. If you move an RDB file from two computers with a big desync in their clocks, funny things may happen (like all the keys loaded to be expired at loading time).
Stabilize your servers time with NTP or equivalent, and consider using RDB Tools (https://github.com/sripathikrishnan/redis-rdb-tools) to recover an hours-old or days-old RDB dump file without keys expiring.
The Sentinels would be the ones detecting trouble with the master Redis process and deciding which slave to promote to master. Have you looked in the Sentinel log files?
The Redis website has a fairly good collection of documentation, including several pages on Sentinel. A good place to start is https://redis.io/topics/sentinel which has examples of how/why Sentinels could believe a Redis master process has become unavailable.
Nice summary, pity about the plug in the bottom line.
Speaking as an OSS Redis Geek, no extra tooling needed as of Redis v4 - just call the (awesome!) MEMORY USAGE command ;)
The description of the SELECT command explains a lot about the databases: https://redis.io/commands/select
As the command description mentions, it was added in v1.0.0 (Sept 2009), which was quite early in Redis's history (March 2009 until today). It is not supported in the sharding version of Redis (named Redis Cluster), and there has been debate in the past about whether it's a useful feature, especially in production environments. I.e., some people feel it's useful, others feel it's not useful.
Why not just use different database numbers in the same redis instance? http://www.rediscookbook.org/multiple_databases.html
There’s no intrinsic security in having separate processes given that the whole security model of redis is predicated on having it running in a trusted env: https://redis.io/topics/security
The Redis documentation calls the elements in a hash "fields". When your first post said "I'd like to delete multiple hashes at once", it sounded like you were asking to delete entire hashes (keys).
In the HDEL command description at https://redis.io/commands/hdel I'm not seeing an indication that the given field names can be patterns that Redis will expand into literal field names. It looks to me like the sequence of actions would be:
This is the kind of thing that Lua could probably simplify into a single command for you.
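For instance, a sketch of such a script (the key and pattern below are made up) that deletes every field of a hash whose name matches a Lua pattern:

```lua
-- EVAL this with something like: EVAL "<script>" 1 myhash '^temp:'
-- Deletes every field of hash KEYS[1] whose name matches the Lua
-- pattern in ARGV[1]; returns how many fields were removed.
local removed = 0
for _, field in ipairs(redis.call('HKEYS', KEYS[1])) do
  if string.match(field, ARGV[1]) then
    redis.call('HDEL', KEYS[1], field)
    removed = removed + 1
  end
end
return removed
```

For very large hashes you'd want to iterate with HSCAN instead of HKEYS so a single call doesn't block the server for too long.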
First, what version of the Redis server are you using?
Second, the examples at the bottom of https://redis.io/topics/notifications don't seem to have the simple 'PSUBSCRIBE *' syntax you describe in your question. Are you trying to use ordinary Publish/Subscribe syntax for notifications/events?
Ok thanks for the help. So the only reference to swap I can find is in what is labeled a deprecated page: https://redis.io/topics/virtual-memory. Yet the admin page suggests swap. Has anyone used swap with Redis?
https://redis.io/topics/sentinel#sentinel-commands. Look at the SENTINEL FAILOVER command. Why deploy sentinels and then do by hand the tasks they can handle? But ultimately it is your choice what strategy you wish to follow. 👍🏼
I just started learning about redis in depth, and the book I'm reading right now has an example of a time series written in javascript. I can't really say how good the information is because I'm still new to redis.
>In this chapter, a library in Node.js will be created to exemplify how to implement a time series in Redis using the String, Hash, Sorted Set, and HyperLogLog data types. This library records events per second, minute, hour, and day. It also provides query functions to retrieve the data over time.
>Later on, we will make this library memory-efficient using Hashes instead of Strings, and also add a feature to store and search for unique events in a given timestamp range using Sorted Sets and HyperLogLogs.
(if you google you can find MUCH cheaper copies..)