diff -rubB /usr/src/linux-2.2.16/net/ipv4/ip_masq.c gatek/net/ipv4/ip_masq.c
--- /usr/src/linux-2.2.16/net/ipv4/ip_masq.c	Thu May 4 02:16:53 2000
+++ gatek/net/ipv4/ip_masq.c	Mon Jul 24 17:10:27 2000
@@ -1982,11 +1982,12 @@
 	 * ... don't know why 1st test DOES NOT include 2nd (?)
 	 */
-	if (skb->pkt_type != PACKET_HOST || skb->dev == &loopback_dev) {
-		IP_MASQ_DEBUG(2, "ip_fw_demasquerade(): packet type=%d proto=%d daddr=%d.%d.%d.%d ignored\n",
+	if ((skb->pkt_type != PACKET_HOST || skb->dev == &loopback_dev) && skb->fwmark == 0) {
+		IP_MASQ_DEBUG(2, "ip_fw_demasquerade(): packet type=%d proto=%d sadr=%d.%d.%d.%d daddr=%d.%d.%d.%d ignored"
+			" - dev=%s, fwmark=%d\n",
 			skb->pkt_type,
-			iph->protocol,
-			NIPQUAD(iph->daddr));
+			iph->protocol, NIPQUAD(iph->saddr),
+			NIPQUAD(iph->daddr),skb->dev->name,skb->fwmark);
 		return 0;
 	}
Old text:
TODO: I plan to make it hierarchical like CBQ and add bounded
and isolated functionality, but only if I see that someone
wants it.
Here is the patch for 2.2.15 and for
tc. Send bugs to devik@cdi.cz.
Have fun!
This is the most recent qdisc. It supersedes WRR and classful
TBF. It is prioritized DRR plus TBF in one.
It can be used instead of CBQ because the CBQ implementation in
Linux has many drawbacks (low accuracy and high complexity).
You create HTB using:
tc qdisc add dev eth0 root handle 1: htb
This creates a qdisc that has no classes yet, so create some:
tc class add dev eth0 classid 1:1 htb rate 100kbit burst 10k prio 1
tc class add dev eth0 classid 1:2 htb rate 100kbit burst 10k prio 0
Now attach filters to 1: and you are done. Each class acts as a small
TBF, but the classes are prioritized and each class can borrow unused
bandwidth from the others (tc -s -d shows it).
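As an illustration of the filter step, here is a minimal sketch using the standard u32 classifier. The match rules (SSH on TCP port 22 as the prioritized traffic, everything else in the other class) are assumptions chosen for the example, not part of the original setup:

```shell
# Assumed policy: TCP traffic to port 22 goes to class 1:2 (prio 0, highest)
tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
    match ip protocol 6 0xff \
    match ip dport 22 0xffff \
    flowid 1:2

# Assumed catch-all: every other IP packet goes to class 1:1 (prio 1)
tc filter add dev eth0 parent 1: protocol ip prio 2 u32 \
    match ip dst 0.0.0.0/0 \
    flowid 1:1

# Watch the per-class counters
tc -s -d class show dev eth0
```

The filter with the lower prio number is consulted first, so the catch-all only sees packets that the port 22 rule did not match.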
It is very accurate, but the same rate limit applies as for TBF
(read the comments in both the TBF and HTB sources).
CBQ rates can differ wildly from what you expect (because classes
borrow from parents without accounting it to the children), but
in HTB it works exactly as you want.
You need to set only rate and burst. Priority defaults to 0
(highest); you can set it to any value in 0 .. 3. Unused bandwidth
is divided by DRR; by default each class has a DRR quantum of
1500. This means that unused bandwidth is divided equally, not
in proportion to the rates. You can change this with the quantum
parameter (not implemented in tc yet).
I currently use HTB for sharing a wireless link. Using
priorities I can rate limit precisely to 256kbit and still
get 1ms RTT for small prioritized packets on a 100% utilized link.
With SFQ at the leaves, the link runs like a charm ;-)
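A sketch of what "SFQ at the leaves" could look like for the two classes created earlier. The handles 10: and 20: and the perturb interval are assumed values, and the sketch assumes this HTB accepts a child qdisc on each class:

```shell
# Attach an SFQ to each HTB class so flows inside a class share it fairly
tc qdisc add dev eth0 parent 1:1 handle 10: sfq perturb 10
tc qdisc add dev eth0 parent 1:2 handle 20: sfq perturb 10
```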
Because of these problems I was forced to find another way
to handle traffic in our company.
Linux has a good implementation of TBF (token bucket filter).
Hence one could create a TBF and insert a WRR (weighted round robin)
scheduler into it. The TBF is set to the rate that we can use
on the shared medium, and the quanta sizes in WRR divide it correctly
among the flows. For the WRR scheduler see below.
But Linux's TBF can't use an inner queuing discipline. Instead it
always uses an internal fifo queue. So here is a patch to sch_tbf.c for
the 2.2.15 kernel. It adds support for an optional inner qdisc, which
makes the whole thing much more flexible.
The patch still has two TODOs described in the code. It adds support
for classes in TBF or, to be more precise, it creates exactly
one class named X:1, where X is the qdisc handle.
Unfortunately it triggers a bug in the iproute2/tc code, so here
is a diff which fixes it:
--- tc_class.old.c	Tue Jul 11 21:32:21 2000
+++ tc_class.c	Tue Jul 11 11:52:00 2000
@@ -230,7 +230,7 @@
 	}
 	if (t->tcm_info)
 		fprintf(fp, "leaf %x: ", t->tcm_info>>16);
-	if ((q = get_qdisc_kind(RTA_DATA(tb[TCA_KIND]))) != NULL)
+	if ((q = get_qdisc_kind(RTA_DATA(tb[TCA_KIND]))) != NULL && q->print_copt)
 		q->print_copt(q, fp, tb[TCA_OPTIONS]);
 	else
 		fprintf(fp, "[UNKNOWN]");

Because TBF sometimes needs to delay an already dequeued packet, I used an internal queue to hold such a packet until the next dequeue event. This also minimized changes to the original code. The scheduler works in the same way as before until a child qdisc is attached.
Note that the original TBF could introduce delays of at most limit/rate. Now, when you use the prio scheduler as the inner qdisc, the average delay for a packet in the high priority band will be avg_packet_len/rate/2.
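A sketch of the TBF plus prio combination described above. It only works with the patched sch_tbf.c, and the rate, burst and limit values are assumptions picked for illustration:

```shell
# Root TBF caps the total rate (illustrative values)
tc qdisc add dev eth0 root handle 1: tbf rate 256kbit burst 10k limit 30k

# The patched TBF exposes exactly one class, 1:1; attach prio as inner qdisc
tc qdisc add dev eth0 parent 1:1 handle 10: prio
```

With these assumed numbers, limit/rate is 30 * 1024 * 8 bits / 256000 bit/s, or about 0.96 s of worst-case delay in the unpatched TBF. For an assumed average packet of 1000 bytes, avg_packet_len/rate/2 is 8000 / 256000 / 2 s, or about 16 ms, for the high priority band.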
So, using the classes' quanta we can affect both the bandwidth ratio between classes (through the quantum ratios) and the packet delay (through the quanta's absolute size).
I hope this qdisc becomes a standard part of the Linux kernel, as it fills the gap between the simple prio scheduler and the complex cbq one.
sch_wrr.c - to compile it you have to add the appropriate line to net/sched/Makefile. If the sched code maintainer considers my patches/code useful I will merge the code into the 2.5 branch. q_wrr.c and Makefile are patches for the TC tool (iproute2).
I tried to hack the kernel in this manner but I lack some important information. For example I don't understand why there are both dev->qdisc and dev->qdisc_sleeping. I need to understand this before I start to implement a real ingress qdisc.
Temporary hack
Because we need to do ingress shaping right now, I hacked the 2.2.15
kernel a bit. I added a new field to the device (netdevice.h) which
controls what to do with incoming packets. The field is set using
ifconfig IF metric N (because the metric ioctl was unused).
When N is nonzero, all incoming packets from that device are
resent to the loopback device. This of course does not apply to packets
which originated from the loopback, and ARP packets are also excluded
(because ARP needs to know the original device where the packet arrived).
Now you can attach any qdisc to the lo device and all incoming
packets go through it.
To distinguish between locally generated packets and our fictitious
packets, each fictitious packet is marked by setting its fwmark to
the value N. One can use the fw filter to assign them to a different
class (queue).
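Put together, the hack could be used roughly as follows. This is only a sketch for a kernel carrying the patch: the interface eth1, the mark value 2 and the class rate are assumptions, and the HTB commands simply reuse the syntax shown earlier on this page:

```shell
# Resend everything arriving on eth1 to lo, tagged with fwmark 2
ifconfig eth1 metric 2

# Shape on the loopback device
tc qdisc add dev lo root handle 1: htb
tc class add dev lo classid 1:1 htb rate 100kbit burst 10k prio 1

# The fw classifier matches on fwmark: handle 2 selects packets marked 2
tc filter add dev lo parent 1: protocol ip prio 1 handle 2 fw classid 1:1
```

Locally generated loopback traffic is not marked, so this filter does not match it. Whether that traffic needs a class of its own depends on how the qdisc treats unclassified packets.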
There should be no problem with locking, as net_bh in the 2.2 kernel
can be interrupted only by hardware.
The hack is a bit dirty but it works. I would not expect such a feature
in the kernel, but if you need it, here is the patch.
Martin Devera <devik@cdi.cz>