Troubleshooting JUNOS Platforms
JTAC Knowledge Base Case Study: Part 2
The slide shows a portion of the results returned from the Knowledge Base search. The highlights suggest that ID number 1971 is quite promising in that it seems to deal with temperature threshold values for J Series platforms.
Troubleshooting Tool Kit for JUNOS Platforms • Chapter 3–65
JTAC Knowledge Base Case Study: Part 3
The contents of ID number 1971 are helpful, and with your newfound wisdom, you are well on your way to J Series Guru status.
Note that you can provide feedback to the maintainers of the CSC to indicate whether particular entries were helpful to you.
Best-Practices Case Study
The slide highlights the topic we discuss next.
Deploying an Out-of-Band Management Network
Relying on in-band methods to manage your network might seem like a good idea up until the point that a circuit or hardware outage prevents you from accessing your network and, as a result, prolongs corrective actions. We highly recommend deploying an out-of-band management network because it provides you with a back door into your network during times of outage or disruption.
All JUNOS platforms come with a built-in out-of-band interface in the form of fxp0. Note that fxp0 is an out-of-band interface because transit traffic cannot be routed over this interface. Put another way, if a packet arrives on fxp0 it can never egress on a PFE interface, and vice versa. Because of this behavior, we do not recommend running a routing protocol over the fxp0 interface in most cases. Instead, we recommend a static route flagged with no-readvertise. This flag ensures that the static route used for out-of-band connectivity is not advertised by any routing protocol.
We also recommend the use of a backup gateway, especially when your hardware supports redundant REs. The system uses the backup gateway entry whenever rpd is not running, such as on a backup RE or on a system where rpd has been shut down because of thrashing.
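The recommendations above can be sketched as a short configuration fragment. This is a minimal sketch rather than a verified configuration: every address and prefix is a hypothetical placeholder taken from documentation ranges, and it assumes that the backup gateway is configured with the backup-router statement at the [edit system] hierarchy.

```
interfaces {
    fxp0 {
        unit 0 {
            family inet {
                /* Hypothetical management LAN address. */
                address 192.0.2.10/24;
            }
        }
    }
}
routing-options {
    static {
        /* Hypothetical prefix for the remote management stations. */
        route 198.51.100.0/24 {
            next-hop 192.0.2.1;
            /* Keep this route out of all routing protocols. */
            no-readvertise;
        }
    }
}
system {
    /* Used while rpd is not running, for example on a backup RE. */
    backup-router 192.0.2.1 destination 198.51.100.0/24;
}
```

Limiting the backup gateway entry to the management prefix, rather than a default route, keeps the fallback path as narrow as the static route it stands in for.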
Your out-of-band connectivity should provide both Ethernet (fxp0-based) and console access to your routers. You normally gain console access through some type of terminal server. We recommend console access whenever you perform serious maintenance activities, like upgrading or downgrading the system software, because if something goes wrong, or the system somehow returns to a factory default, you might no longer have Ethernet-based access to the system. Having console access is the only way that you can reload software from removable media or recover a lost root password.
Recommended System Log Settings
Wherever possible, you should place the following system logging recommendations into effect:
• Use a remote syslog host: This recommendation helps in archiving syslog messages, and ensures that these valuable messages are available even in the event of a catastrophic failure of a router.
• Archive logs: You should configure syslog archive settings that ensure retaining entries for at least two weeks. This suggestion is especially important when remote system logging is not in place. We recommend configuring 20 copies of the messages file with each copy being at least 1 MB in size, except on J Series routers, which have limited storage space.
• Log CLI commands and configuration changes: We have all seen the joke about what to do if you break something while no one is watching: just walk away. While this advice is perhaps sound, it is futile when the system configuration logs interactive CLI commands. When combined with unique user logins, the logging of all commands issued on the machine provides an excellent audit trail of who did what, and when.
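The three recommendations above might translate into a syslog stanza along the following lines. Treat this as an illustrative sketch: the remote host address and the file names cli-commands and config-changes are hypothetical, and the facility and severity choices are assumptions that you should adjust to your own logging policy.

```
system {
    syslog {
        /* Keep 20 archived copies of each log file, 1 MB each. */
        archive size 1m files 20;
        /* Hypothetical remote syslog host. */
        host 192.0.2.50 {
            any notice;
        }
        file messages {
            any notice;
            authorization info;
        }
        /* Record every interactive CLI command. */
        file cli-commands {
            interactive-commands any;
        }
        /* Record configuration changes. */
        file config-changes {
            change-log any;
        }
    }
}
```

On J Series routers, the archive values would need to be smaller because of the limited storage space noted above.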
Synchronize Router Clock
We recommend using the Network Time Protocol (NTP) to synchronize all routers to a common, and preferably accurate, time source. By synchronizing all routers, you ensure that time stamps on log messages are both accurate and meaningful, which is especially important when conducting security-related forensics where you must correlate events that might have occurred on numerous machines.
JUNOS Software Needs a Reference
The basis for NTP is a series of timing hierarchies, with a Stratum 1 (atomic) timing source at the very top. While accuracy is desirable, you do not need to synchronize to a Stratum 1 reference to benefit from having synchronized views of the time of day. JUNOS Software cannot provide its own timing source because it does not support the definition of a local, undisciplined clock source (for example, the local crystal oscillator). If needed, you can always obtain a commodity UNIX device of some type and configure it to provide a timing reference based on its local clock. Again, remember that any synchronization, even if based on an inaccurate local clock, is better than none.
JUNOS Software supports client, server, and symmetric modes of NTP operation, and can also support broadcast and authentication. We recommend the use of authentication to ensure that an attacker cannot compromise your synchronization.
The slide provides a typical NTP-related configuration stanza. While complete coverage of NTP is beyond the scope of this course, note that two machines can synchronize only when their current clocks are relatively close. A boot-server can set a router's clock at boot time to ensure that it is close enough to later synchronize to the configured time server. You can also issue a set date ntp address command as a substitute for a boot-server. Use the show ntp associations command to display synchronization status.
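For readers without the slide at hand, a client-mode configuration with authentication could look roughly like the following. This is a sketch under stated assumptions, not a copy of the slide: the server address, key number, and key value are hypothetical placeholders.

```
system {
    ntp {
        /* Hypothetical time server; sets the clock at boot time. */
        boot-server 192.0.2.123;
        /* Hypothetical key; must match the key defined on the server. */
        authentication-key 1 type md5 value "example-secret";
        server 192.0.2.123 key 1;
        trusted-key 1;
    }
}
```

Pointing boot-server and server at the same address is a choice made here for brevity; the two statements are independent of each other.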
Enable Core Dumps
Based on feedback from JTAC, system and chassis dump-on-panic is now enabled by default. Depending upon the JUNOS Software version, you might need to use hidden configuration commands to enable core dumps. Note that a change in chassis dump status normally requires a reboot of the PFE before placing the settings into effect. You can place the change into effect immediately by issuing a set coredump enable command on each PFE component that contains an embedded host. The chassis dump-on-panic statement enables core dumps on all PFE components (at reboot).
Configuration Requires Hidden Commands
The slide shows the hidden configuration statements that you need to enable system and chassis core dumps.
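Based on the statement names used in the text, the hidden configuration probably resembles the fragment below. This is an assumption-laden sketch rather than a reproduction of the slide: hidden statements do not appear in CLI help or command completion, so confirm the exact syntax against your JUNOS Software release before relying on it.

```
system {
    /* Hidden statement: enables kernel core dumps. */
    dump-on-panic;
}
chassis {
    /* Hidden statement: enables PFE core dumps (at reboot). */
    dump-on-panic;
}
```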
Confirming Dump Settings
Because core dumps are so critical when dealing with transient software failures, it is worthwhile to confirm the current settings for both system and chassis dumps. Waiting until after a crash to find out that you needed a reboot to enable a PFE core file is no way to begin the day.
The first code example shows the operator using the shell to obtain kernel parameters via the sysctl command. The output pipes to grep with the match criteria of coredump. In this example it is clear that we enabled kernel core dumps.
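The check described above can be sketched as the following session. The prompts are hypothetical, and the output is omitted because the parameter names and values it returns depend on the software release.

```
user@host> start shell
% sysctl -a | grep coredump
```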
Confirming PFE Dump Settings
You can establish a vty or cty connection from the RE to an embedded host on the PFE to confirm the dump status of a given PFE component by issuing a show coredump command. In this example, we see a confirmation that we enabled core dumps for an M10i router’s CFEB. Note that this setting tells the PFE component that it should place a copy of its core file onto the RE’s /var/tmp directory, where you can easily access the core file.