There aren't obvious resources to teach architectures at this level; this is experience and exposure and inference.
We commonly use reverse proxies in front of app servers for a wide variety of reasons. Often the reverse proxy is actually on the same machine/instance as the app, if that accomplishes the goals. CORS, recently, and combining multiple discrete apps into one virtual hostname are common applications.
Security is another. One site I know has a policy of never using IIS without a reverse-proxy with tight security in front of it, after a history of problems with IIS. The reverse proxy always has to handle the TLS ("SSL") either way, but a reverse proxy can transparently add TLS/HTTPS to an app that doesn't support it.
Then there's performance. The most popular web servers and dedicated reverse proxies are coded in tight C and many of them compete for speed and low resource usage. Using an evented web server to serve all static files and only passing through app requests to a Tomcat or Undertow servlet container can reduce resource usage and improve performance.
But in the end the reverse proxy is a wrapper that lets us control everything. If you have a recalcitrant vendor app that doesn't issue proper cache-control headers and never rewrites URLs properly, you can put a reverse proxy in front of it and fix that right up as far as the client is concerned. Once you're comfortable with the architecture you use it constantly.
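As a sketch of that last point, this is roughly what the "fix it in the proxy" pattern looks like with Go's standard library. The vendor app's address and the header value here are made up:

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        // Hypothetical vendor app listening on localhost:8081.
        target, err := url.Parse("http://127.0.0.1:8081")
        if err != nil {
            log.Fatal(err)
        }

        proxy := httputil.NewSingleHostReverseProxy(target)

        // Override whatever the vendor app sends before the
        // response reaches the client.
        proxy.ModifyResponse = func(resp *http.Response) error {
            resp.Header.Set("Cache-Control", "public, max-age=300")
            return nil
        }

        log.Fatal(http.ListenAndServe(":8080", proxy))
    }

The same hook is where you'd rewrite Location headers or URLs in the body if the app gets those wrong too.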
Caching is your friend. If that doesn't cut it, Varnish (or any other HTTP caching app) will make your site as fast as a static site by saving static HTML versions of the pages being viewed; subsequent requests for the same URL are served the cached HTML.
Hi, I guess these will make a good basis for a FAQ section :)
> just using CloudFront
CF is a good thing, but it's still Amazon, and it shares the same problem that forces people to avoid cross-datacenter traffic to S3: when the amount of outbound traffic and the request rate go up, the cost of service can rise insanely high. You can check out this calculator and see what the monthly cost of transferring 500 kiB files at a sustained 100/s rate comes to. S3 is about $10K+ and CF just doubles that.
https://calculator.s3.amazonaws.com/index.html
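Back-of-the-envelope, assuming roughly $0.09/GB egress: 500 kiB × 100/s ≈ 50 MB/s, which is about 130 TB over a 30-day month; 130,000 GB × $0.09 ≈ $11-12K/month before request charges, which lines up with the figure above.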
> or deploying Varnish servers
Varnish is not zero-config: someone has to support it, it has its own scripting language, and there are a couple of books written about it.
Varnish is complex, so it has bugs like this https://varnish-cache.org/security/VSV00001.html#vsv00001
Objstore, on the other hand, is very simple inside: it just serves files using io.Reader or io.ReadSeeker, backed by an os.File or a file body from S3.
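For illustration (this is a sketch of the approach, not objstore's actual code), serving a file through an io.ReadSeeker in Go is about ten lines, and http.ServeContent gives you range requests and conditional GETs for free. The path is hypothetical:

    package main

    import (
        "log"
        "net/http"
        "os"
    )

    func main() {
        http.HandleFunc("/file", func(w http.ResponseWriter, r *http.Request) {
            // os.File implements io.ReadSeeker, so http.ServeContent
            // handles Range and If-Modified-Since for us.
            f, err := os.Open("/var/cache/objstore/example.bin") // hypothetical path
            if err != nil {
                http.Error(w, "not found", http.StatusNotFound)
                return
            }
            defer f.Close()

            fi, err := f.Stat()
            if err != nil {
                http.Error(w, "stat failed", http.StatusInternalServerError)
                return
            }
            http.ServeContent(w, r, fi.Name(), fi.ModTime(), f)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }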
The projects have different feature sets: Varnish doesn't handle writes.
My favourite: «Varnish High Availability is available as part of a Varnish Plus subscription. If you’d like to get these components up and running now, please contact one of our Varnish Plus Sales Executives»
I hope these are enough reasons to have a good replacement for Varnish, implemented in pure Go and designed to be failure-proof, for free.
Seconded. If I were OP and wanted to optimize the website, I would wait until I have a site set up, measure the cost of the bits in question, and have determined that there is a clear need to have those bits removed/refactored. If/when that is determined, I would make a pull request to have it included in the WP core.
If OP is talking about public-facing, downloaded assets like CSS, JS, or images, any cost associated with those bits will likely be mitigated by including a caching layer like Varnish or a CDN.
If OP is talking about limiting backend functionality, then there are likely other approved and standard ways to do it.
Forking WordPress could be an interesting academic side project, but there's going to be so much additional work and upkeep that I would never consider it for any production website.
Everyone should read this post.
TL;DR: trying to keep a file in memory yourself may be slower than writing it to disk, because "writing to disk" doesn't actually write to disk immediately, thanks to the page cache.
One helpful improvement (pretty easily applicable in Varnish) is having a grace period.
Have an interval where data is still served from cache while it gets updated in the background, so on any moderately used resource the client never sees the "wait time" when a miss happens.
Basically, instead of having a TTL of, say, 600s, you have a TTL of 580s plus 20s of grace, where clients are served "soon to be stale" data while the request for fresh data runs in the background. A possible improvement would be adding a probabilistic element (i.e. the closer you are to the TTL, the higher the chance that the cache will choose to refresh the element earlier than the TTL) to lower the chance of refreshes syncing with each other.
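In Varnish itself this is a couple of lines of VCL (set beresp.ttl = 580s; set beresp.grace = 20s; in vcl_backend_response). Here's the same idea sketched generically in Go; it's a toy under stated assumptions: all names are made up, there's no eviction, and concurrent first misses may fetch twice:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    type entry struct {
        value      string
        softExpiry time.Time // ttl: after this, serve stale + refresh
        hardExpiry time.Time // ttl+grace: after this, block and fetch
    }

    type graceCache struct {
        mu         sync.Mutex
        items      map[string]entry
        refreshing map[string]bool
        fetch      func(key string) string // stands in for the backend request
        ttl, grace time.Duration
    }

    func (c *graceCache) Get(key string) string {
        c.mu.Lock()
        e, ok := c.items[key]
        now := time.Now()

        // Fresh hit, or stale-but-within-grace: serve immediately.
        if ok && now.Before(e.hardExpiry) {
            if now.After(e.softExpiry) && !c.refreshing[key] {
                // Past the TTL but inside grace: kick off a background
                // refresh; this client (and the next) never waits.
                c.refreshing[key] = true
                go c.refresh(key)
            }
            c.mu.Unlock()
            return e.value
        }
        c.mu.Unlock()

        // True miss (or past grace): this client pays for the fetch.
        v := c.fetch(key)
        c.store(key, v)
        return v
    }

    func (c *graceCache) refresh(key string) {
        v := c.fetch(key) // the request running in the background
        c.store(key, v)
        c.mu.Lock()
        delete(c.refreshing, key)
        c.mu.Unlock()
    }

    func (c *graceCache) store(key, value string) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.items[key] = entry{
            value:      value,
            softExpiry: time.Now().Add(c.ttl),
            hardExpiry: time.Now().Add(c.ttl + c.grace),
        }
    }

    func main() {
        c := &graceCache{
            items:      map[string]entry{},
            refreshing: map[string]bool{},
            fetch:      func(key string) string { return "rendered:" + key },
            ttl:        580 * time.Second,
            grace:      20 * time.Second,
        }
        fmt.Println(c.Get("/page")) // miss, then cached for 580s + 20s grace
    }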
I'd have a look at the Varnish VCL and nginx configs and see if there is something fucky going on. Maybe the previous owner put some misguided directives in there.
Also see if logging and reporting gives you something: https://varnish-cache.org/docs/6.2/users-guide/report.html
Easy solution: stick it behind Cloudflare and let them worry about the caching of pages as well as many other things like DDOS protection or serving stale pages if your back end goes down.
Otherwise look at Varnish for an on server solution.
With both of these you will need to make sure you correctly set the cache headers on responses if you are not already.
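For instance, a backend handler in Go might emit something like this (values are illustrative; Varnish honors s-maxage out of the box, while Cloudflare only caches HTML if you enable page caching):

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            // max-age applies to the browser; s-maxage applies only
            // to shared caches like Varnish or a CDN.
            w.Header().Set("Cache-Control", "public, max-age=60, s-maxage=600")
            fmt.Fprintln(w, "hello")
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }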
While the above solution is outstandingly helpful, it needs one addition: non-SSL traffic will not get anywhere. You also need to redirect (or proxy) plain traffic (port 80) either to the respective machines directly or to the SSL versions.
The first solution involves duplicating all the "server" pieces of the config by /u/CollateralFortune and changing "443 ssl" to "80" (i.e. listen 80 http2;).
The second solution:
    server {
        listen 80 default;
        server_name *.kodysalak.com;
        access_log off;
        error_log off;

        ## redirect http to https ##
        return 301 https://$server_name$request_uri;
    }
And don't forget to add port 80 to nginx ports.conf.
There are obviously many other solutions to this. One that comes to mind is varnish. Varnish is a high performance http cache/proxy solution. It can redirect traffic from port 80 to port 443 and after nginx terminates SSL, it can cache and proxy http traffic to appropriate machines. Let me know if you need further details.
This is a much-used pattern when implemented with Varnish and ESI. It also allows you to expire specific resources / components from the cache, and lets you attach a Vary header to the resource to deliver multiple versions to different users.
If your content is dynamic, you can achieve this with a web accelerator like Varnish, which is a reverse proxy cache.
A request comes in, and if your cache has the page in memory, it serves that and doesn't even touch your application server. If it doesn't have it cached, it passes the request to the application server and then caches the response for next time. Depending on the content, you can set the cache time to various amounts and even integrate cache invalidation into your application code.
With this setup, you can get up to a 1000x increase in server performance, depending on your backend.
> So although i understand your argument to use POST for this, I need to introduce something that will not cause conflict with this framework convention
Let me put it this way.
Let's say you're passing a non-idempotent action through PUT, say "increase game character magic skill +1".
For some reason, at your server, or at some router along the way, or god knows where, traffic slows down and the command is either lost, or the response never comes back.
So an intermediary sees "PUT" and decides "I'm not getting a response, so this means I can re-send, as PUT is idempotent". The intermediary is simply following HTTP, according to spec.
Turns out the message was not lost, just slowed down, so two copies of "increase game character magic skill +1" arrive at the server. Suddenly the user gets a +2 increment in magic skill, not +1.
> One question: the framework is trying to follow REST. Correct me if I'm wrong, but wouldn't POST on update actions contradict the RESTfulness?
No, but sending non-idempotent actions over PUT would contradict HTTP itself, and you risk actions being sent multiple times to your server in certain conditions.
Actual REST, as well as actual HTTP, are not a matter of conventions, but a matter of specifications. And when you don't follow the specifications, it has implications for whether your app will behave correctly.
Also PUT implies you're replacing the resource at the URL with what you're sending, not merely sending partial updates. If you're sending partial updates, then products like Varnish HTTP Cache will misbehave as well, thinking your "update" is the actual resource to return next time.
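To make the distinction concrete, here's a sketch of both shapes in Go; the URL and the storage are made up, only the method semantics matter:

    package main

    import (
        "io"
        "log"
        "net/http"
        "strconv"
        "strings"
        "sync"
    )

    var (
        mu    sync.Mutex
        skill = 5
    )

    func main() {
        http.HandleFunc("/character/1/skill", func(w http.ResponseWriter, r *http.Request) {
            switch r.Method {
            case http.MethodPut:
                // PUT: replace the stored value with the body.
                // Replaying this request N times leaves the same
                // state, so intermediaries may safely retry it.
                body, err := io.ReadAll(r.Body)
                if err != nil {
                    http.Error(w, "bad request", http.StatusBadRequest)
                    return
                }
                n, err := strconv.Atoi(strings.TrimSpace(string(body)))
                if err != nil {
                    http.Error(w, "not a number", http.StatusBadRequest)
                    return
                }
                mu.Lock()
                skill = n
                mu.Unlock()
            case http.MethodPost:
                // POST: apply an increment. Replaying it changes the
                // state every time; a retried "+1" becomes "+2".
                mu.Lock()
                skill++
                mu.Unlock()
            default:
                http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
            }
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }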
https://twitter.com/dormando/status/1402466173778677764
> Lol'ing hard at this header image. I changed the varnish default "Guru Meditation" to Mediation a million years ago for reasons that no longer matter. I bet that's the only change I made that's still live.
It was so they could distinguish Fastly's Varnish errors from upstream Varnish errors. Not really needed now because it's distinguishable by a different error code format, but evidently nobody bothered reverting it.
Bonus: PHK, lead Varnish developer, talking about why the error page is so terse.
You can use an HTTPS load balancer or proxy tool in front of Varnish. But with a single geographical node, you lose the benefit of a CDN in this setup.
nginx, haproxy and others can do the HTTPS side; the Varnish docs mention HAProxy:
This is a good point. You could do authentication and authorization (steps 8 and 9) ahead of 'current representations' and '(proactive) content negotiation' (steps 6 and 7). The trade-off here is that you have less information to feed into your authorization algorithm. It's the authorization step that's going to restrict access (and you only actually need to authenticate if your authorization logic takes the 'subject' into account).
However, if you're serving a public resource, authorization isn't going to help restrict requests. If you're concerned about DDoS attacks, and aren't relying on infrastructure upstream of you to manage those, you'd focus your mitigations in Step 2. For example, you could use something like https://github.com/vladimir-bukhtoyarov/bucket4j to create rate-limited buckets on IP addresses or ranges.
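bucket4j is a Java library; for illustration, here is roughly the same token-bucket idea in Go using golang.org/x/time/rate. It's a sketch under assumptions: no eviction of old buckets, and it trusts r.RemoteAddr, which you shouldn't behind a proxy:

    package main

    import (
        "net"
        "net/http"
        "sync"

        "golang.org/x/time/rate"
    )

    var (
        mu      sync.Mutex
        buckets = map[string]*rate.Limiter{}
    )

    // limiterFor returns the token bucket for one client IP,
    // creating it on first sight: 10 req/s with bursts of 20.
    func limiterFor(ip string) *rate.Limiter {
        mu.Lock()
        defer mu.Unlock()
        l, ok := buckets[ip]
        if !ok {
            l = rate.NewLimiter(10, 20)
            buckets[ip] = l
        }
        return l
    }

    func limit(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            ip, _, _ := net.SplitHostPort(r.RemoteAddr)
            if !limiterFor(ip).Allow() {
                http.Error(w, "too many requests", http.StatusTooManyRequests)
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        h := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("ok\n"))
        })
        http.ListenAndServe(":8080", limit(h))
    }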
If you're worried about performance then put a cache in front of your server (such as https://varnish-cache.org/), and focus on providing validators in your representation metadata (etag, last-modified) to improve cache hits. This is going to give you much more improvement than tinkering with the ordering of these steps.
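Concretely, "providing validators" looks like this Go sketch, where the ETag is derived from the body (the resource itself is made up):

    package main

    import (
        "crypto/sha1"
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        http.HandleFunc("/resource", func(w http.ResponseWriter, r *http.Request) {
            body := []byte("the current representation")

            // Strong validator derived from the content.
            etag := fmt.Sprintf(`"%x"`, sha1.Sum(body))
            w.Header().Set("ETag", etag)
            w.Header().Set("Cache-Control", "public, max-age=0, must-revalidate")

            // Cache (or browser) revalidating: reply 304, no body.
            if r.Header.Get("If-None-Match") == etag {
                w.WriteHeader(http.StatusNotModified)
                return
            }
            w.Write(body)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }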
For a cache, you could go the varnish way ( https://varnish-cache.org/ ).
Varnish doesn't try to implement a complex userland LRU mechanism; it just lets the kernel swap whatever it sees fit. It's always faster to get the data back from local swap than to fetch it from the upstream server (or to write/read the cache to/from disk in userland).
If you're managing the hosting of your blog, have you thought of installing a Varnish cache on top of your blog? Varnish is fast and scalable for static content and it would save you time and effort into migrating content to another platform. Some links: - https://varnish-cache.org/ - https://kruyt.org/ghost-blog-caching-with-varnish/
You might want to try getting a VPS hosted close to the target country. Then you could run something like Varnish on it to act as a front-end caching reverse proxy.
Finally, you could update your DNS records to resolve to the closest server to the requester. IIRC it's called anycast DNS? I can't remember.
Anyway, see also this article.
If two systems are managing a memory/disk cache, they end up fighting under pressure. The OS pages out the RAM, then the application realizes it's unused, and pulls it back in to write it out to disk. In general, it turns out to be better to let the OS, with a view of the overall system, manage what part is in RAM.
Much longer version: the Varnish architect notes.
There are things that stay the same for a long time, e.g. a blog post, or at least for long enough. If it takes some work to render them, caching will help a lot on a busy site. But as you say, some things are for a particular logged-in user (and should not be cached) or are very time-dependent, and you may have a configuration to not cache them.
Regarding your objection: if you want to show the day or hour, but that depends on where the visitor comes from, you can emit a GMT time (at some resolution, like minutes or something caching-friendly) and use JavaScript to show the right time in their local timezone.
Also, a web page is composed of a lot of elements, some very dynamic, some not so much. You can use Varnish to optimize those cases with Edge Side Includes, rendering a composite HTML page from different endpoints with different expiry policies.
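For illustration, here's what the backend side of that can look like in Go: the shell page and the fragment carry different Cache-Control lifetimes, and Varnish (with set beresp.do_esi = true; for the shell in VCL) splices them together at the edge. The URLs and TTLs are made up:

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        // The shell page is cacheable for a day; Varnish processes
        // the <esi:include> tag and splices in the fragment.
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Cache-Control", "public, s-maxage=86400")
            w.Header().Set("Content-Type", "text/html")
            fmt.Fprint(w, `<html><body><h1>Mostly static page</h1><esi:include src="/fragments/stock-ticker"/></body></html>`)
        })

        // The dynamic fragment expires every 10 seconds on its own.
        http.HandleFunc("/fragments/stock-ticker", func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Cache-Control", "public, s-maxage=10")
            fmt.Fprint(w, "<p>fresh numbers here</p>")
        })

        log.Fatal(http.ListenAndServe(":8080", nil))
    }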
Varnish's author once wrote about their method, which is basically to just let the OS handle the paging, because doing so manually and fighting the allocator is a recipe for disaster and accomplishes the opposite of what you want.
Varnish HTTP Cache is software you place in front of your website that assists in caching results, supposedly to help speed up the site for end users.
Why it's returning that error I'm not sure, but it seems like Varnish is working fine but their website backend is down.
Set up Varnish Cache on the VPS using your home NextCloud as the backend.
Varnish will sit between the Internet and your NextCloud, adding requests to its in-memory cache as they are made. The first time a file or page is requested, it will be slow, as it's served from the backend. Subsequent requests for the same file or page will be lightning-fast, served from Varnish's cache.
You can just set the cache's TTL (Time To Live) to a very high value - like hours or days - to tell Varnish to keep the items in memory.
> In what way?
However you configure it to. Personally, I use Drupal, and what's most common is to have it cache the full generated HTML of a page, meaning <html> to </html>.
Of course, for logged in users this doesn't really work. For this, Drupal can just cache page elements too, like Views, Nodes, Blocks, etc.
The whole thing should be behind a CDN or caching proxy anyways. Personally, I like Varnish.
Hey!
I'm not sure that swap space will help you in a traffic spike.
The solution to your problem depends, in part, on whether the traffic is legitimate or not.
If it is, then consider these options:

- Put your box behind Cloudflare. It will, if configured correctly, cache the PHP output from WordPress and serve it directly, thereby reducing requests.
- Use Varnish. Same as above, but self-hosted. Executing PHP is expensive, so ideally you don't want to be doing it for every request.
If not, then additionally consider these options:

- Set up SSHGuard or fail2ban.
- Block the IP addresses attacking your server by some other method, preferably automatically.
The usual data flow for a request arriving would be:
Teh Interwebz -> (Optional Load Balancer) -> Web Server -> Application Server -> Your Rails Code
The web server, NGINX in your example, can efficiently manage things like terminating SSL traffic, efficiently listening on many sockets, serving static files (images, CSS, etc), and then routing requests further. For example, a more complex website might actually be several small apps, separated by subdomain or path. NGINX could efficiently forward on traffic to the correct application server.
An application server is more interested in running your code, and is either the-same-as, or tightly coupled to, the language of the application it will run. In your example, that is Puma and Ruby.
Especially with Ruby the application server will be slower and/or require far more RAM than the web server for a given volume of traffic. So another reason for their separation is that you might only need to run one web server, but be running several copies of your application server at once. On a particular server it is reasonable to imagine one core/thread for NGINX, and four or six cores/threads of application server(s) running. Puma specifically is multi-threaded, where some other Ruby application servers just fork.
Caching can happen at each layer, depending on exactly how things are configured. If you specifically use Rails' caching features, then the cache is at the application server / Rails level. For the most part, Rails is unconcerned with caching static files, but can indeed cache rendered views, fragments, and the like. For particularly high-traffic applications, there is often an entire caching layer added "in front of" NGINX; something like varnish.
Sorry, that was poorly phrased on my part. h2 requires TLS; or not exactly, but no browser supports unencrypted HTTP/2, which amounts to the same thing. As far as the warning goes, it's just going to train people to ignore warnings, doing more harm than good.
https://varnish-cache.org/docs/trunk/phk/http20.html EDIT: it may only be on pages with a login form or credit card field; if so, that's cool.
How about using a simple MVC framework like Laravel, as you said, but with a heavy caching server in between, something like Varnish HTTP Cache? Then it would only make requests to the MVC app when necessary, and your server could invalidate caches when the page is updated.
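The invalidation half is usually an HTTP PURGE sent to Varnish whenever a page changes. PURGE only works if your VCL is set up to accept it (and you should restrict it to trusted IPs), so this Go sketch covers just the application side; the port and path are made up:

    package main

    import (
        "log"
        "net/http"
    )

    // purge asks Varnish to drop its cached copy of one URL.
    // Varnish needs a vcl_recv rule that handles the PURGE method,
    // typically restricted to trusted client IPs.
    func purge(url string) error {
        req, err := http.NewRequest("PURGE", url, nil)
        if err != nil {
            return err
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return err
        }
        return resp.Body.Close()
    }

    func main() {
        // Call this from the MVC app's "page updated" hook.
        if err := purge("http://127.0.0.1:6081/pages/about"); err != nil {
            log.Fatal(err)
        }
    }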
PHP 7 is far faster than 5.6.
Make sure XAMPP is using the correct PHP binary.
Dev mode will always be slower than prod, as Symfony collects a TON of useful profiling information on every single page load, which you can explore in the nifty toolbar and the profiler pages.
A cold cache in dev mode means you've changed something since the last page view, and Symfony will now automatically rebuild all of its caches, which also takes precious milliseconds. A warm cache is one that doesn't need to be invalidated or rebuilt, so Symfony will reuse the assets it's already built. It's the warm-cache dev mode that will be a halfway decent indicator of performance in prod, though still not quite. When it comes to prod, the cache is very sticky. Updates to controllers, and templates especially, may not be reflected in prod mode, because Symfony won't rebuild the caches until it's told to or absolutely has to (say, by deleting the cache folder). So in prod mode, most everyone requesting your site will be hitting a warmed-up cache with no dev tools, and the site ends up being extremely performant under pressure.
A mechanical (spinning) hard drive will hinder performance, especially in dev, as Symfony is writing files all the time, and reading them too. If you deploy to something like DigitalOcean for prod (for instance), which uses solid-state storage, you're looking at a pretty ideal situation.
For even better prod performance, you may want to look at a static cache server. But you shouldn't need that unless you're getting a ton of traffic that may cripple your server's resources.
Have a look at this first: HTTP Cache in Symfony before doing anything drastic.
Then look at this: Symfony Performance
If all else fails, you can always sit Varnish in front of it.