| Tags: netlogon, ntds, paused, rollback, usn |
#1
| USN Rollback, NTDS General Errors, and Paused NetLogon
Hello,

Our Active Directory is on the fritz. We have 7 domain controllers spread over 3 domains in a single forest. Due to a problem over the weekend, we restored all of the domain controllers to a point-in-time backup. Since then, we appear to be in a USN rollback condition (see KB 875495).

We consolidated the FSMO roles and demoted 4 DCs, leaving one DC per domain. We waited for replication, then added the DCs back.

The DCs continue to get NTDS General Event ID 2103, "The Active Directory database has been restored using an unsupported restoration procedure." The NetLogon service starts up paused.

Any idea what to do next? Thanks in advance,

J Wolfgang Goerlich

Microsoft Article 875495, "How to detect and recover from a USN rollback in Windows Server 2003"
http://support.microsoft.com/?kbid=875495
|
#2
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
Hello.

Can you tell us a bit about how you took your backups, and how you restored them? Were these image-based backups?

Also, of the DCs that were left after your mass demotion/promotion, which did you restore and which did you leave alone? I wasn't sure if you restored all DCs or a subset, or....

Thanks!
~Eric

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
|
#3
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
Hello Eric,

We are on a SAN. Before the upgrade, I rebooted one DC at a time. While the DC was down, I made a point-in-time backup (actually, a block-level image) on the SAN. I did this for all of the DCs, one after the other, with the maximum time between the first and last backup being 18 minutes. No changes were made during those 18 minutes.

We restored the DCs approximately 15 hours later. We had a blackout period and were able to take all of the DCs down at once. We restored to the SAN backups. We booted up the PDCs first (which also hold the GCs), and then the subsequent DCs.

The next day, about 24 hours after the restore, we diagnosed the USN rollback. Repadmin /showutdvec showed that the PDCs had the highest USN. We moved the FSMO roles from the other DCs to the three PDCs. We then demoted the four DCs, waited for replication (around four hours to be safe), and then began promoting one DC at a time. The first was fine. The second and third DCs still started with NTDS General Event ID 2103. At this point, adding the fourth back in is on hold.

Appreciate the response. Hope this clarifies the situation.

J Wolfgang Goerlich
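The USN comparison described in this post (reading up-to-dateness vectors with Repadmin /showutdvec) can be sketched as a toy check. This is a minimal illustration under stated assumptions, not the logic that repadmin or the directory service runs: the DC names and USN values are hypothetical, and `find_rolled_back` is a made-up helper. It flags any DC whose own highest USN is lower than the USN some partner has already recorded for it, which is the signature of a rollback until the DC's counter catches up again.

```python
from typing import Dict, List


def find_rolled_back(local_usn: Dict[str, int],
                     utd_vectors: Dict[str, Dict[str, int]]) -> List[str]:
    """Return DCs whose own highest USN is below what a partner recorded for them.

    local_usn:   DC name -> highest USN the DC currently reports for itself
    utd_vectors: DC name -> {source DC name -> highest USN seen from that source}
    """
    suspects = []
    for dc, own_usn in local_usn.items():
        # Highest USN any *other* DC claims to have received from this DC.
        seen_by_partners = max(
            (vec.get(dc, 0) for partner, vec in utd_vectors.items() if partner != dc),
            default=0,
        )
        if seen_by_partners > own_usn:
            suspects.append(dc)
    return sorted(suspects)


# Hypothetical values for illustration only (not taken from the thread).
local_usn = {"ROOT-PDC": 9000, "ROOT-DC2": 12500, "CHILD-PDC": 20000}
utd_vectors = {
    "ROOT-PDC": {"ROOT-DC2": 12400, "CHILD-PDC": 19900},
    "ROOT-DC2": {"ROOT-PDC": 9800, "CHILD-PDC": 19950},
    "CHILD-PDC": {"ROOT-PDC": 9750, "ROOT-DC2": 12300},
}

print(find_rolled_back(local_usn, utd_vectors))  # ['ROOT-PDC']
```

In this made-up data, partners have recorded USN 9800 from ROOT-PDC while ROOT-PDC itself reports only 9000, so it is the one flagged.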
|
#4
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
First, I'd like to start by pointing out that what was done is explicitly against the "rules" of AD. Rolling back a DC w/o using proper procedures (either the standard backup/restore procedures, or VSS + the AD writer and similar mechanisms) leads to exactly where we are now. In some cases, fixing it can be exceptionally painful. Sometimes the forest is never the same.

Really, saying that we need to use the appropriate backup/restore procedures is not just a line. :) We really do, or the replication model suffers. Replication has no way of knowing that a DC was rolled back (we can't tell that the drives were swapped like this), which is what we have made our VSS provider handle. Please, please don't do this again. The article you cited has info on this. :)

Ok, I feel better now that I've gotten that off of my chest.... :)

Back to the real issue at hand. Are there other domains? When you said you went to just the PDC, was this the only DC in the entire forest, or were there others around? If there were others, can you tell us about what else is out there? When others were down, did you metadata clean them up? What procedures were followed?

Thanks!
~Eric

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
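Eric's point that replication cannot tell a DC was rolled back can be illustrated with a small model. This is a simplified sketch, not how the directory service is implemented: `DomainController`, `Partner`, and the USN counts are hypothetical, and real replication also filters by originating stamps, which this model ignores. The partner keeps a high watermark per invocation ID. An image restore rewinds the USN but keeps the old invocation ID, so new changes that reuse already-seen USNs are skipped. A supported restore assigns a new invocation ID, so the partner starts tracking the DC afresh.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DomainController:
    name: str
    usn: int = 0
    invocation_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def write(self, count: int) -> None:
        # Each local change consumes the next USN.
        self.usn += count

    def image_restore(self, usn_at_snapshot: int) -> None:
        # Unsupported: the USN counter rewinds, the invocation ID stays the same.
        self.usn = usn_at_snapshot

    def supported_restore(self, usn_at_backup: int) -> None:
        # Supported: the USN counter rewinds, but the database gets a new identity.
        self.usn = usn_at_backup
        self.invocation_id = str(uuid.uuid4())


@dataclass
class Partner:
    # invocation ID -> highest USN already pulled from that database instance
    watermark: Dict[str, int] = field(default_factory=dict)

    def pull(self, source: DomainController) -> int:
        """Return how many USNs the partner asks for on this pull."""
        seen = self.watermark.get(source.invocation_id, 0)
        requested = max(0, source.usn - seen)
        self.watermark[source.invocation_id] = max(seen, source.usn)
        return requested


def changes_seen_after_restore(supported: bool) -> int:
    dc, partner = DomainController("DC1"), Partner()
    dc.write(100)
    partner.pull(dc)            # partner has seen USNs 1-100
    dc.write(50)
    partner.pull(dc)            # partner has seen USNs 1-150
    if supported:
        dc.supported_restore(100)
    else:
        dc.image_restore(100)
    dc.write(30)                # new changes land on USNs 101-130
    return partner.pull(dc)


print(changes_seen_after_restore(supported=False))  # 0
print(changes_seen_after_restore(supported=True))   # 130
```

In the unsupported case the partner already holds a watermark of 150 for the same invocation ID, so the 30 new changes are never requested. In the supported case the partner has no watermark for the new ID and asks for the whole range, which includes the new changes.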
|
#5
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
> First, I'd like to start by pointing out that what was done is explicitly
> against the "rules" of AD.

Ah, but it *worked* in the test environment. Note that all DCs were rolled back simultaneously to backups that occurred with a maximum 18 minute delta and that contained no explicit AD changes. (Had to get it off of my chest <grin>).

Of course, you are right: this did not work in production. Ok, ok, no more trying to be fancy.

> Are there other domains?

Three domains total, one forest.

> When you said you went to just the PDC, was this the only DC in the entire forest,
> or were there others around?

We went down to one PDC per domain, three total PDCs online.

> When others were down, did you metadata clean them up?

No, did not do a metadata cleanup. The other DCs removed cleanly. When the second two DCs came up with problems, I ran an integrity check, soft recovery, and set the registry for a non-authoritative restore (BurFlags). This did not help.

Much obliged for the help,

J Wolfgang Goerlich
|
#6
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
The problem is that you don't always hit every condition in test that could arise when you do this. In fact, given that your change rate is different (and the internal change rate within AD is different), it won't be the same. So no matter what you saw in test, it is still considered a high-risk operation.

When you were down to just the three DCs, did you have USN rollback there? Ideal would be going back to bare-bones numbers, ensuring complete end-to-end health with them, then building back from there.

Did you take repadmin to this at all and make any modifications (specifically any with the /sync switch)?

~Eric

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
|
#7
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
> When you were down to just the three DCs, did you have USN rollback there?

Yes, on the PDC in the root domain. The two other PDCs were fine.

> Ideal would be going back to bare-bones numbers, ensuring complete end to
> end health with them, then building back from there.

Agreed. However, we cannot do this w/o losing domain objects. Given the size of our domains, this is not an option.

> Did you take repadmin to this at all and make any modifications
> (specifically any with the /sync switch)?

No. Should I have?

Thanks again,

J Wolfgang Goerlich
|
#8
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
The "ideal" comment was specific to what you did already: fewer DCs. So you went back to just the 3 PDCs, which makes life far easier to troubleshoot.

When you had just those 3 DCs, what NC did you get USN rollback for?

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
|
#9
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
Sorry, that question wasn't clear.

More generally, what were the errors when you were in that state of very few DCs? Do you have the logs from that time that I could look at?

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
|
#10
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
> More generally, what were the errors when you were in that state of very few
> DCs? Do you have the logs from that time that I could look at?

The only errors were on the root domain PDC. On bootup, this logs NTDS General Event ID 2103 and pauses the NetLogon service. Curiously, the second DC in the same domain works fine. Even more curious, the second DC in the child domain gets 2103 even though its PDC works fine.

I have logs but that is basically it.

J Wolfgang Goerlich
|
#11
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
Can you email me the logs please? Drop the "online" from my address.

--
Eric Fleischman [MSFT]
These postings are provided "AS IS" with no warranties, and confers no rights.
|
#12
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
> Can you email me the logs please?

Certainly. The logs are on their way.

J Wolfgang Goerlich
|
#13
| Re: USN Rollback, NTDS General Errors, and Paused NetLogon
I just wanted to comment on this part. I don't need to talk about the other part because ~Eric is one of the most qualified to help with it; if he can't help you, you are in a world of pain.

Anyway, working in test does not make it ok to do in production; doing this kind of thing is still against the "rules". Test is good for things that are supposed to work but where you still need a little confidence boost, say a schema change. The fact that your image-based backup of AD worked in test only says one thing: you were lucky. Though maybe it doesn't even say that, because you went and did it in production.

You have to keep in mind that AD is a single distributed system. It should not be thought of as a simple collection of servers. As such, if you have an idea of backing it up in a way that MS says not to, at the very least do a complete shutdown of every system involved prior to the backup so you are truly in a dead, nothing-changing state. That gets you some chance of possibly succeeding.

Basically, just because you didn't make any changes doesn't mean changes aren't being made and replicated. AD is a living system that is constantly updating AD attributes on its own without any guidance from you. Also, if you have things like Exchange or other directory-aware apps, they can be making changes as well that you have no knowledge of.

--
Joe Richards Microsoft MVP Windows Server Directory Services
www.joeware.net

jwgoerlich@gmail.com wrote:
>> First, I'd like to start by pointing out that what was done is explicitly
>> against the "rules" of AD.
>
> Ah, but it *worked* in the test environment. Note that all DCs were
> rolled back simultaneously to backups that occurred with a maximum 18
> minute delta and that contained no explicit AD changes. (Had to get it
> off of my chest <grin>).
>
> Of course, you are right: this did not work in production. Ok, ok, no
> more trying to be fancy.
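Joe's point about changes happening on their own between backups can be made concrete with a toy simulation. This is a hedged sketch under stated assumptions: the `DC` class, the two-DC topology, and the USN counts are all hypothetical, and partners are keyed by name because the invocation ID never changes in an image restore. DC A is snapshotted first, one background change then replicates to B, and B is snapshotted afterwards. After both snapshots are restored, B already believes it has A's next USN.

```python
import copy
from typing import Dict


class DC:
    def __init__(self, name: str) -> None:
        self.name = name
        self.usn = 0
        # partner name -> highest USN already pulled from that partner
        self.seen: Dict[str, int] = {}

    def write(self, count: int = 1) -> None:
        self.usn += count

    def pull_from(self, other: "DC") -> int:
        """Pull changes from another DC; return how many USNs were requested."""
        start = self.seen.get(other.name, 0)
        pulled = max(0, other.usn - start)
        self.seen[other.name] = max(start, other.usn)
        return pulled


def staggered_snapshot_demo() -> int:
    a, b = DC("A"), DC("B")
    a.write(100)
    b.pull_from(a)               # B has seen A up to USN 100

    snap_a = copy.deepcopy(a)    # snapshot of A taken first (A at USN 100)
    a.write()                    # background change nobody asked for: USN 101
    b.pull_from(a)               # B has now seen A up to USN 101
    snap_b = copy.deepcopy(b)    # snapshot of B taken a few minutes later

    a, b = snap_a, snap_b        # restore both DCs from their snapshots
    a.write()                    # a brand-new change reuses USN 101
    return b.pull_from(a)        # B thinks it already has USN 101


print(staggered_snapshot_demo())  # 0
```

The new change on A is never requested by B, even though every DC was restored and no administrator made a change during the backup window.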
|