Regarding (1): have you set the lp_clock_gating_test_signal attribute? This attribute tells the tool what to connect to the test pin of the clock-gating (CG) cell. Usually, you set it to your shift_enable signal:
define_dft shift_enable -active high -name SE ....
set_attr lp_clock_gating_test_signal SE /des*/*
If you didn't do this, it is not surprising that the DFT rule checks fail after clock gating is inserted: a clock gater without a test "override" leaves the gated clock uncontrollable in test mode.
Be sure you are running check_dft_rules BEFORE synthesis, and be sure it is clean. You can check DFT rules again after synthesis, but if you don't run it before, your flops will NOT be mapped to scan flops.
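To make the ordering concrete, here is a rough sketch of how the steps might be sequenced in a script. The file name (my_design.v), design name (my_design), and port name (scan_en) are hypothetical placeholders, library setup is omitted, and the exact options can differ between tool versions, so check each command against your release's command reference rather than treating this as a definitive flow:

```tcl
# Sketch only: my_design.v, my_design, and scan_en are hypothetical names.
# Library setup (set_attribute library ...) is omitted for brevity.

set_attribute lp_insert_clock_gating true /   ;# enable clock gating before elaboration
set_attribute dft_scan_style muxed_scan /     ;# assumed scan style; adjust to your flow

read_hdl my_design.v
elaborate my_design

# Define the shift enable, then point the CG test pin at it.
define_dft shift_enable -active high -name SE scan_en
set_attribute lp_clock_gating_test_signal SE /designs/my_design

check_dft_rules          ;# run BEFORE synthesis and fix violations first

synthesize -to_mapped    ;# with clean rules, flops can map to scan flops

check_dft_rules          ;# optional re-check once clock gating is in the netlist
report clock_gating      ;# review gating coverage
```

The point of the sketch is the order: the test signal and the lp_clock_gating_test_signal attribute are in place before the first check_dft_rules, and that check happens before synthesize.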
Regarding (2), I don't often use clock_gating insert_in_netlist. This tends to result in poor clock gating coverage, and is only recommended if you don't have the option of starting from RTL.